Re: [Python-Dev] Py2.6 release schedule

2008-02-11 Thread Travis E. Oliphant
Christian Heimes wrote:
 Guido van Rossum wrote:
   
 I do think that another (final) 3.0 alpha before PyCon would also be a
 good idea. This way we can gel the release some more. For 2.6 I think
 we'll need more alpha releases after PyCon; I doubt the backporting
 from 3.0 (which has only started seriously quite recently) will be
 done by PyCon.
 

 I've backported class decorators (http://bugs.python.org/issue1759).
 Two tests are failing and I need some help to solve the riddle.

 Several backports, like the bytearray type and the new io module, depend
 on a backport of the new buffer protocol. Travis, can you please
 raise the priority of porting your PEP to 2.6?

   
Yes, I will.  What are your time-lines?  I've been targeting first week 
in March.

-Travis


___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 3118: Extended buffer protocol (new version)

2007-04-18 Thread Travis E. Oliphant
Jim Jewett wrote:
 Reading this message without the entire PEP in front of me showed some
 confusing usage.  (Details below)  Most (but not all) I could resolve
 from the PEP itself, but they could be clarified with different
 constant names.
 

I'm going to adapt some suggestions made by you and Carl Banks.  Look 
for an updated flags section of the PEP shortly.

-Travis


[Python-Dev] Access to Python SVN

2007-04-16 Thread Travis E. Oliphant
I'd like to ask for access to Python SVN so that I can keep PEP 3118 
up to date, as well as eventually make the changes needed to 
implement the extended buffer protocol.

I have quite a bit of experience with the Python C-API and understand 
many parts of the code base fairly well (though I would not claim to be 
an expert on all of it).

I promise to only adjust the PEP until such time as patches to implement 
the extended buffer protocol are approved.

I will email my public SSH key to the appropriate place.

Thank you very much,

-Travis Oliphant






Re: [Python-Dev] An updated extended buffer PEP

2007-03-27 Thread Travis E. Oliphant
Greg Ewing wrote:
 Here's another idea, to allow multiple views of the same
 buffer with different shape/stride info to coexist, but
 without extra provider objects or refcount weirdness.
 Also it avoids using calls with a brazillion arguments.
 
struct bufferinfo {
  void **buf;
  Py_ssize_t *len;
  int *writeable;
  char **format;
  int *ndims;
  Py_ssize_t **shape;
  Py_ssize_t **strides;
  int **isptr;
};
 
int (*getbuffer)(PyObject *obj, struct bufferinfo *info);
 
int (*releasebuffer)(PyObject *obj, struct bufferinfo *info);


This is not much different from my original view object.  Stick a 
PyObject_HEAD at the start of this bufferinfo and you have it.

Memory management was the big reason I wanted to do something like this.

I don't see why a PyObject_HEAD would make anything significantly 
slower.  Then we could use Python's memory management very easily to 
create and destroy these things.  This bufferinfo object would become 
the provider I was talking about.

 If the object has constant shape/stride info, it just fills
 in the info struct with pointers to its own memory, and does
 nothing when releasebuffer is called (other than unlocking
 its buffer).
 
 If its shape/stride info can change, it mallocs memory for
 them and copies them into the info struct. When releasebuffer
 is called, it frees this memory.

 
 It is the responsibility of the consumer to ensure that the
 base object remains alive until releasebuffer has been called
 on the info struct (to avoid leaking any memory that has
 been malloced for shapes/strides).
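Greg's copy-or-point scheme can be modeled in pure Python (a hypothetical sketch; the class and attribute names are illustrative, not from any PEP):

```python
# Model of the scheme above: an exporter with constant shape/strides hands
# out references to its own arrays; an exporter whose shape can change
# copies them (like malloc + memcpy), and releasebuffer frees only copies.

class BufferInfo:
    """Mirrors the C struct bufferinfo, filled in by the exporter."""
    def __init__(self):
        self.shape = None
        self.strides = None
        self.owned = False   # True if shape/strides were copied

class ConstantShapeExporter:
    def __init__(self, shape, strides):
        self._shape = shape          # never changes: safe to share directly
        self._strides = strides

    def getbuffer(self, info):
        info.shape = self._shape     # "pointer" to our own memory
        info.strides = self._strides
        info.owned = False

    def releasebuffer(self, info):
        pass                         # nothing to free

class MutableShapeExporter:
    def __init__(self, shape, strides):
        self.shape = shape
        self.strides = strides

    def getbuffer(self, info):
        info.shape = list(self.shape)     # copy, like malloc'd memory
        info.strides = list(self.strides)
        info.owned = True

    def releasebuffer(self, info):
        if info.owned:                    # like free()
            info.shape = info.strides = None

exp = MutableShapeExporter([2, 3], [24, 8])
info = BufferInfo()
exp.getbuffer(info)
exp.shape[0] = 99             # exporter mutates itself after export
assert info.shape == [2, 3]   # the consumer's copy is unaffected
exp.releasebuffer(info)
```

The consumer-side obligation from the quoted text carries over unchanged: the base object must stay alive until releasebuffer is called.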

This is a reasonable design choice.  I actually prefer to place all the 
buffer information in a single object rather than the multiple argument 
design because it scales better and is easier to explain and understand.

-Travis



Re: [Python-Dev] An updated extended buffer PEP

2007-03-27 Thread Travis E. Oliphant
Carl Banks wrote:
 Travis Oliphant wrote:
 Travis Oliphant wrote:
 Hi Carl and Greg,

 Here is my updated PEP which incorporates several parts of the 
 discussions we have been having.
 And here is the actual link:

 http://projects.scipy.org/scipy/numpy/browser/trunk/numpy/doc/pep_buffer.txt 
 
 
 What's the purpose of void** segments in PyObject_GetBuffer?  It seems 
 like it's leftover from an older incarnation?
 

Yeah, I forgot to change that location.

 I'd hope after more recent discussion, we'll end up simplifying 
 releasebuffer.  It seems like it'd be a nightmare to keep track of what 
 you've released.

Yeah, I agree.   I think I'm leaning toward the bufferinfo structure 
which allows the exporter to copy memory for things that it wants to be 
free to change while the buffer is exported.

 
 
 Finally, the isptr thing.  It's just not sufficient.  Frankly, I'm 
 having doubts whether it's a good idea to support multibuffer at all. 
 Sure, it brings generality, but I'm thinking it's too hard to explain and 
 too hard to get one's head around, and will lead to lots of 
 misunderstanding and bugginess.  OTOH, it really doesn't burden anyone 
 except those who want to export multi-buffered arrays, and we only have 
 one shot to do it.  I just hope it doesn't confuse everyone so much that 
 no one bothers.

People used to have doubts about explaining strides in NumPy as well.  I 
sure would have hated to see them eliminate the possibility because of 
those doubts.  I think the addition you discuss is not difficult once you 
get the hang of it.

I also understand now why subbufferoffsets is needed.  I was thinking 
that for slices you would just re-create a whole other array of pointers 
to contain that addition.   But, that is really not advisable.  It makes 
sense when you are talking about a single pointer variable (like in 
NumPy) but it doesn't when you have an array of pointers.

Providing the example about how to extract the pointer from the returned 
information goes a long way towards clearing up any remaining confusion.

Your ImageObject example is also helpful.   I really like the addition 
and think it is clear enough and supports a lot of use cases with very 
little effort.

 
 Here's how I would update the isptr thing.  I've changed derefoff to 
 subbufferoffsets to describe it better.
 
 
 typedef PyObject *(*getbufferproc)(PyObject *obj, void **buf,
 Py_ssize_t *len, int *writeable,
 char **format, int *ndims,
 Py_ssize_t **shape,
 Py_ssize_t **strides,
 Py_ssize_t **subbufferoffsets);
 
 
 subbufferoffsets
 
Used to export information about multibuffer arrays.  It is an
address of a ``Py_ssize_t *`` variable that will be set to point at
an array of ``Py_ssize_t`` of length ``*ndims``.
 
[I don't even want to try a verbal description.]
 
To demonstrate how subbufferoffsets works, here is an example of a
function that returns a pointer to an element of ANY N-dimensional
array, single- or multi-buffered.
 
 void* get_item_pointer(int ndim, void* buf, Py_ssize_t* strides,
                        Py_ssize_t* subbufferoffsets, Py_ssize_t* indices) {
     char* pointer = (char*)buf;
     int i;
     for (i = 0; i < ndim; i++) {
         pointer += strides[i]*indices[i];
         if (subbufferoffsets[i] >= 0) {
             pointer = *(char**)pointer + subbufferoffsets[i];
         }
     }
     return (void*)pointer;
 }
 
For single buffers, subbufferoffsets is negative in every dimension
and it reduces to normal single-buffer indexing.  

What about just having subbufferoffsets be NULL in this case?  i.e. you 
don't need it.  If some of the dimensions did not need dereferencing, 
they would be negative (how about we just say -1 to be explicit)?

For multi-buffers,
subbufferoffsets indicates when to dereference the pointer and switch
to the new buffer, and gives the offset into the buffer to start at.
In most cases, the subbufferoffset would be zero (indicating it should
start at the beginning of the new buffer), but can be a positive
number if the following dimension has been sliced, and thus the 0th
entry in that dimension would not be at the beginning of the new
buffer.
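The dereferencing rule above can be modeled in pure Python, treating addresses as plain integers and "memory" as a dict from address to stored pointer value (a hypothetical sketch of the same arithmetic as get_item_pointer; the addresses are made up):

```python
# Same loop as the C function: step by stride*index in each dimension,
# and where the suboffset is non-negative, follow the stored pointer and
# then skip suboffset bytes into the new buffer.

def get_item_address(buf, strides, suboffsets, indices, memory):
    addr = buf
    for stride, suboff, idx in zip(strides, suboffsets, indices):
        addr += stride * idx
        if suboff >= 0:
            # multi-buffer dimension: dereference, then offset into
            # the new buffer
            addr = memory[addr] + suboff
    return addr

# Single buffer: suboffsets all negative -> plain strided indexing.
assert get_item_address(1000, [24, 8], [-1, -1], [1, 2], {}) == 1040

# Two-level buffer: row pointers live at buf + i*8, each pointing at a row.
memory = {2000: 5000, 2008: 6000}     # hypothetical row-pointer table
assert get_item_address(2000, [8, 4], [0, -1], [1, 2], memory) == 6008
```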
 
 
 
 Other than that, looks good. :)
 

I think we are getting closer.   What do you think about Greg's idea of 
basically making the bufferinfo structure the provider, and having the 
exporter handle copying memory over for shape and strides if it wants to 
be able to change those before the lock is released?

-Travis


[Python-Dev] The latest extended buffer PEP

2007-03-27 Thread Travis E. Oliphant

The latest update is here.  Carl and Greg, can I add your names to the 
PEP author list?

I think we are very close.  I'd like to start working on the 
implementation.  The modifications to the struct module are probably where 
I'll start.

I really like the possibilities this will open up for sharing video, 
images, audio, and databases between different objects.  Algorithms could 
be written that are object-agnostic and work for any object exporting 
the buffer interface.

Are we ready for a pronouncement?

-Travis





Re: [Python-Dev] PEP: Extending the buffer protocol to share array information.

2006-11-01 Thread Travis E. Oliphant
Fredrik Lundh wrote:
 Terry Reedy wrote:
 
 I believe that at present PyGame can only work with external images that it 
 is programmed to know how to import.  My guess is that if image source 
 program X (such as PIL) described its data layout in a way that NumPy could 
 read and act on, the import/copy step could be eliminated.
 
 I wish you all stopped using PIL as an example in this discussion;
 for PIL 2, I'm moving towards an entirely opaque data model, with a 
 data view-style client API.

That's an unreasonable request.  The point of the buffer protocol is to 
allow people to represent their data in whatever way they like 
internally but still share it in a standard way.  The extended buffer 
protocol allows sharing of the shape of the data and its format in a 
standard way as well.

We just want to be able to convert the data in PIL objects to other 
Python objects without having to write special converter functions. 
It's not important how PIL or PIL 2 stores the data as long as it 
participates in the buffer protocol.

Of course if the memory layout were compatible with the model of NumPy, 
then data-copies would not be required, but that is really secondary.

-Travis




[Python-Dev] idea for data-type (data-format) PEP

2006-11-01 Thread Travis E. Oliphant

Thanks for all the comments that have been given on the data-type 
(data-format) PEP.  I'd like opinions on an idea for revising the PEP I 
have.

What if we look at this from the angle of trying to communicate 
data-formats between different libraries (not changing the way anybody 
internally deals with data-formats)?

For example, ctypes has one way to internally deal with data-formats 
(using type objects).

NumPy/Numeric has a way to internally deal with data-formats (using 
PyArray_Descr * structure -- in Numeric it's just a C-structure but in 
NumPy it's fleshed out further and also a Python object called the 
data-type).

Numarray has a way to internally deal with data-formats (using type 
objects).

The array module has a way to internally deal with data-formats (using a 
PyArray_Descr * structure -- and character codes to select one).

The struct module deals with data-formats using character codes.

The PIL deals with data-formats using image modes.

PyVTK deals with data-formats using its own internal objects.

MPI deals with data-formats using its own MPI_Datatype structures.

This list goes on and on.

What I claim is needed in Python (to make it better glue) is to have a 
standard way to communicate data-format information between these 
extensions.  Then, you don't have to build in support for all the 
different ways data-formats are represented by different libraries.  The 
library only has to be able to translate their representation to the 
standard way that Python uses to represent data-format.

How is this goal going to be achieved?  That is the real purpose of the 
data-type object I previously proposed.

Nick showed that there are two (non-orthogonal) ways to think about this 
goal.

1) We could define a special string-syntax (or list syntax) that covers 
every special case.  The array interface specification goes this 
direction and it requires no new Python types.  This could also be seen 
as an extension of the struct module to allow for nested structures, etc.

2) We could define a Python object that specifically carries data-format 
information.


There is also a third way (or really 2b) that has been mentioned:  take 
one of the extensions and use what it does to communicate data-format 
between objects and require all other extensions to conform to that 
standard.

The problem with 2b is that what works inside an extension module may 
not be the best option when it comes to communicating across multiple 
extension modules.   Certainly none of the extension modules have argued 
that case effectively.

Does that explain the goal of what I'm trying to do better?
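As a toy illustration of how options (1) and (2) can converge, here is a hypothetical sketch in which the string syntax of option (1) is simply the constructor input for the data-format object of option (2). The class name and the kind codes are illustrative only, not from any PEP:

```python
# A tiny data-format object constructed from a NumPy-style string such as
# 'u4' (4-byte unsigned int) or 'f8' (8-byte float): the string syntax and
# the object are two faces of the same information.

class DataFormat:
    KINDS = {'i': 'signed int', 'u': 'unsigned int', 'f': 'float'}

    def __init__(self, spec):
        kind, size = spec[0], spec[1:]
        if kind not in self.KINDS or not size.isdigit():
            raise ValueError("unrecognized data-format: %r" % spec)
        self.kind = self.KINDS[kind]
        self.itemsize = int(size)       # in bytes

    def __repr__(self):
        return "DataFormat(%s, %d bytes)" % (self.kind, self.itemsize)

fmt = DataFormat('u4')
assert fmt.kind == 'unsigned int' and fmt.itemsize == 4
```

A library could then accept either form: parse the string on the way in, and pass the already-parsed object between extensions.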







Re: [Python-Dev] idea for data-type (data-format) PEP

2006-11-01 Thread Travis E. Oliphant
Travis E. Oliphant wrote:
 Thanks for all the comments that have been given on the data-type 
 (data-format) PEP.  I'd like opinions on an idea for revising the PEP I 
 have.

 
 1) We could define a special string-syntax (or list syntax) that covers 
 every special case.  The array interface specification goes this 
 direction and it requires no new Python types.  This could also be seen 
 as an extension of the struct module to allow for nested structures, etc.
 
 2) We could define a Python object that specifically carries data-format 
 information.
 
 
 Does that explain the goal of what I'm trying to do better?

In other words, what I'm saying is that I really want a PEP that does this. 
Could we have a discussion about what the best way to communicate 
data-format information across multiple extension modules would look 
like?  I'm not saying my (pre-)PEP is best.  The point of putting it out 
there in its infant state is to get the discussion rolling, not to 
claim I've got all the answers.

It seems like there are enough people who have dealt with this issue 
that we ought to be able to put something very useful together that 
would make Python much better glue.


-Travis



Re: [Python-Dev] idea for data-type (data-format) PEP

2006-11-01 Thread Travis E. Oliphant
Martin v. Löwis wrote:
 Travis E. Oliphant schrieb:
 What if we look at this from the angle of trying to communicate 
 data-formats between different libraries (not change the way anybody 
 internally deals with data-formats).
 
 ISTM that this is not the right approach. If the purpose of the datatype
 object is just to communicate the layout in the extended buffer
 interface, then it should be specified in that PEP, rather than being
 stand-alone, and it should not pretend to serve any other purpose.

I'm actually quite fine with that.  If that is the consensus, then I 
will just go that direction.   ISTM, though, that since we are going to 
the trouble inside the extended buffer protocol, we might as well 
be as complete as we know how to be.

 Or, if it does have uses independent of the buffer extension: what
 are those uses?

So that NumPy and ctypes and audio libraries and video libraries and 
database libraries and image-file format libraries can communicate about 
data-formats using the same expressions (in Python).

Maybe we decide that ctypes-based expressions are a very good way to 
communicate about those things in Python for all other packages.  If 
that is the case, then I argue that we ought to change the array module, 
and the struct module to conform (of course keeping the old ways for 
backward compatibility) and set the standard for other packages to follow.

What problem do you have in defining a standard way to communicate about 
binary data-formats (not just images)?  I still can't figure out why you 
are so resistant to the idea.  MPI had to do it.

 
 1) We could define a special string-syntax (or list syntax) that covers 
 every special case.  The array interface specification goes this 
 direction and it requires no new Python types.  This could also be seen 
 as an extension of the struct module to allow for nested structures, etc.

 2) We could define a Python object that specifically carries data-format 
 information.
 
 To distinguish between these, convenience of usage (and of construction)
 should have to be taken into account. At least for the preferred
 alternative, but better for the runners-up, too, there should be a
 demonstration on how existing modules have to be changed to support it
 (e.g. for the struct and array modules as producers; not sure what
 good consumer code would be).

Absolutely --- if something is to be made useful across packages and 
from Python, this is where the discussion should take place.  The 
struct and array modules would both be consumers as well, so that in 
the struct module you could specify your structure in terms of the 
standard data-representation, and in the array module you could specify 
your array in terms of the standard representation instead of using 
character codes.

 
 Suppose I wanted to change all RGB values to a gray value (i.e. R=G=B),
 what would the C code look like that does that? (it seems now that the
 primary purpose of this machinery is image manipulation)
 

For me, image manipulation is definitely not the only purpose 
(or even the primary purpose).  It's just an easy one to explain --- 
most people understand images.   But, I think this question is actually 
irrelevant (IMHO).  To me, how you change all RGB values to gray would 
depend on the library you are using, not on how data-formats are expressed.

Maybe we are still misunderstanding each other.


If you really want to know: in NumPy it might look like this:

Python code:

img['r'] = img['g']
img['b'] = img['g']

C-code:

use the Python C-API to do essentially the same thing as above or

to do
img['r'] = img['g']

dtype = img->descr;
r_field = PyDict_GetItemString(dtype, "r");
g_field = PyDict_GetItemString(dtype, "g");
r_field_dtype = PyTuple_GET_ITEM(r_field, 0);
r_field_offset = PyTuple_GET_ITEM(r_field, 1);
g_field_dtype = PyTuple_GET_ITEM(g_field, 0);
g_field_offset = PyTuple_GET_ITEM(g_field, 1);
obj = PyArray_GetField(img, g_field_dtype, g_field_offset);
Py_INCREF(r_field_dtype);
PyArray_SetField(img, r_field_dtype, r_field_offset, obj);

But, I still don't see how that is relevant to the question of how to 
represent the data-format to share that information across two extensions.


 The problem with 2b is that what works inside an extension module may 
 not be the best option when it comes to communicating across multiple 
 extension modules.   Certainly none of the extension modules have argued 
 that case effectively.
 
 I think there are two ways in which one option could be better than
 the other: it might be more expressive, and it might be easier to use.
 For the second aspect (ease of use), there are two subways: it might
 be easier to produce, or it might be easier to consume.

I like this as a means to judge a data-format representation. Let me 
summarize to see if I understand:

1) Expressive (does it express every data-format you might want or need)
2) Ease of use
a) Production: How easy is it to create the representation?
b) Consumption: How easy is it to interpret the representation?

Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-11-01 Thread Travis E. Oliphant
Jim Jewett wrote:
 I'm still not sure exactly what is missing from ctypes.  To make this 
 concrete:

I was too hasty.  There are some things actually missing from ctypes:

1) long double (this is not the same across platforms, but it is a 
data-type).
2) complex-valued types (you might argue that it's just a 2-array of 
floats, but you could say the same thing about int as an array of 
bytes).  The point is how do people interpret the data.  Complex-valued 
data-types are very common.  It is one reason Fortran is still used by 
scientists.
3) Unicode characters (there is w_char support but I mean a way to 
describe what kind of unicode characters you have in a cross-platform 
way).  I actually think we have a way to describe encodings in the 
data-format representation as well.

4) What about floating-point representations that are not IEEE 754 
4-byte or 8-byte?   There should be a way to at least express the 
data-format in these cases (this is actually how long double should be 
handled as well, since it varies across platforms what is actually done 
with the extra bits).

So, we can't just use ctypes as a complete data-format representation 
because it's also missing some things.

What we need is a standard way for libraries that deal with data-formats 
to communicate with each other.  I need help with a PEP like this and 
that's what I'm asking for.  It's all I've really been after all along.

A couple of points:

* One reason to support the idea of the Python object approach (versus a 
string syntax) is that it is already parsed.  A list-syntax approach 
(perhaps built from strings for fundamental data-types) might also be 
considered already parsed.

* One advantage of using kind versus a character for every type (like 
struct and array do) is that it helps consumers and producers speed up 
the parser (a fuller branching tree).


-Travis



Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-30 Thread Travis E. Oliphant
Travis Oliphant wrote:
 Greg Ewing wrote:
 Travis Oliphant wrote:


 Part of the problem is that ctypes uses a lot of different Python types 
 (that's what I mean by multi-object) to accomplish its goal.  What 
 I'm looking for is a single Python type that can be passed around and 
 explains binary data.

 It's not clear that multi-object is a bad thing in and
 of itself. It makes sense conceptually -- if you have
 a datatype object representing a struct, and you ask
 for a description of one of its fields, which could
 be another struct or array, you would expect to get
 another datatype object describing that.

Yes, exactly.  This is what the Python type I'm proposing does as well. 
   So, perhaps we are misunderstanding each other.  The difference is 
that data-types are instances of the data-type (data-format) object 
instead of new Python types (as they are in ctypes).
 
 I've tried to clarify this in another post.  Basically, what I don't 
 like about the ctypes approach is that it is multi-type (every new 
 data-format is a Python type).
 

I should clarify that I have no opinion about the ctypes approach for 
what ctypes does with it.  I like ctypes and have adapted NumPy to make 
it easier to work with ctypes.

I'm saying that I don't like the idea of forcing this approach on 
everybody else who wants to describe arbitrary binary data just because 
ctypes is included.  Now, if it is shown that it is indeed better than the 
simpler instances-of-a-single-type approach that I'm basically proposing, 
then I'll be persuaded.

However, the existence of an alternative strategy using a single Python 
type and multiple instances of that type to describe binary data (which 
is the NumPy approach and essentially the array module approach) means 
that we can't just a-priori assume that the way ctypes did it is the 
only or best way.

The examples of missing features that Martin has exposed are not 
show-stoppers.  They can all be easily handled within the context of 
what is being proposed.   I can modify the PEP to show this.  But, I 
don't have the time to spend if it's just all going to be rejected in 
the end.  I need some encouragement in order to continue to invest 
energy in pushing this forward.

-Travis




Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-30 Thread Travis E. Oliphant
Greg Ewing wrote:
 Travis Oliphant wrote:
 
 The 'bit' type re-interprets the size information to be in units of bits 
 and so implies a bit-field instead of another data-format.
 
 Hmmm, okay, but now you've got another orthogonality
 problem, because you can't distinguish between e.g.
 a 5-bit signed int field and a 5-bit unsigned int
 field.

Good point.

 
 It might be better not to consider bit to be a
 type at all, and come up with another way of indicating
 that the size is in bits. Perhaps
 
 'i4'   # 4-byte signed int
 'i4b'  # 4-bit signed int
 'u4'   # 4-byte unsigned int
 'u4b'  # 4-bit unsigned int
 

I like this.  Very nice.  I think that's the right way to look at it.
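A sketch of how a consumer might parse this suggested notation, where a trailing 'b' marks a size in bits rather than bytes (hypothetical; the notation itself was only a proposal in this thread):

```python
# 'i4'  -> 4-byte signed int      'u4'  -> 4-byte unsigned int
# 'i4b' -> 4-bit signed int       'u4b' -> 4-bit unsigned int

def parse_spec(spec):
    kind = spec[0]                       # 'i' signed, 'u' unsigned, ...
    if spec.endswith('b'):
        return kind, int(spec[1:-1]), 'bits'
    return kind, int(spec[1:]), 'bytes'

assert parse_spec('i4')  == ('i', 4, 'bytes')
assert parse_spec('i4b') == ('i', 4, 'bits')
assert parse_spec('u2')  == ('u', 2, 'bytes')
```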

 (Next we can have an argument about whether bit
 fields should be packed MSB-to-LSB or vice versa...:-)

I guess we need another flag / attribute to indicate that.

The other thing that needs to be discussed at some point is a way to 
indicate the floating-point format.  I've basically punted on this and 
just taken 'f' to mean the platform float.

Thus, you can't use the data-type object to pass information between two 
platforms that don't share a common floating point representation.

-Travis



[Python-Dev] PEP: Extending the buffer protocol to share array information.

2006-10-30 Thread Travis E. Oliphant


Attached is my PEP for extending the buffer protocol to allow array data 
to be shared.



PEP: unassigned
Title: Extending the buffer protocol to include the array interface
Version: $Revision: $
Last-Modified: $Date:  $
Author: Travis Oliphant [EMAIL PROTECTED]
Status: Draft
Type: Standards Track
Created: 28-Aug-2006
Python-Version: 2.6

Abstract

This PEP proposes extending the tp_as_buffer structure to include 
function pointers that incorporate information about the intended
shape and data-format of the provided buffer.  In essence this will
place something akin to the array interface directly into Python. 

Rationale

Several extensions to Python utilize the buffer protocol to share
the location of a data-buffer that is really an N-dimensional
array.  However, there is no standard way to exchange the
additional N-dimensional array information so that the data-buffer
is interpreted correctly.  The NumPy project introduced an array
interface (http://numpy.scipy.org/array_interface.shtml) through a
set of attributes on the object itself.  While this approach
works, it requires attribute lookups which can be expensive when
sharing many small arrays.  

One of the key reasons that users often request to place something
like NumPy into the standard library is so that it can be used as a
standard for other packages that deal with arrays.  This PEP
provides a mechanism for extending the buffer protocol (which
already allows data sharing) to add the additional information
needed to understand the data.  This should be of benefit to all
third-party modules that want to share memory through the buffer
protocol such as GUI toolkits, PIL, PyGame, CVXOPT, PyVoxel,
PyMedia, audio libraries, video libraries etc.


Proposal
 
Add a bf_getarrayinfo function pointer to the buffer protocol to
allow objects to share additional information about the returned
memory pointer.  Add the TP_HAS_EXT_BUFFER flag to types that
define the extended buffer protocol. 

Specification:

static int 

bf_getarrayinfo (PyObject *obj, Py_intptr_t **shape, 
 Py_intptr_t **strides, PyObject **dataformat)
   
Inputs:  
 obj -- The Python object being questioned.
 
Outputs: 
 
 [function result] -- the number of dimensions (n)

 *shape -- A C-array of 'n' integers indicating the
  shape of the array. Can be NULL if n==0.

 *strides -- A C-array of 'n' integers indicating
the number of bytes to jump to get to the next
element in each dimension. Can be NULL if the 
array is C-contiguous (or n==0).

 *dataformat -- A Python object describing the data-format
each element of the array should be
interpreted as.

   
Discussion Questions:

1) How is data-format information supposed to be shared?  A companion
proposal suggests returning a data-format object which carries the
information about the buffer area. 

2) Should the single function pointer call be extended into
multiple calls, or should its arguments be compressed into a structure
that is filled?

3) Should a C-API function (or functions) be created to wrap calls to this
function pointer, much as is done now with the buffer protocol?  What should
the interface of this function (or these functions) be?

4) Should a mask (for missing values) be shared as well? 

Reference Implementation

Supplied when the PEP is accepted. 

Copyright

This document is placed in the public domain.


Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-30 Thread Travis E. Oliphant
M.-A. Lemburg wrote:
 Travis E. Oliphant wrote:
 
 I understand and that's why I'm asking why you made the range
 explicit in the definition.
 

In the case of NumPy it was so that String and Unicode arrays would both 
look like multi-length string character arrays and not arrays of 
arrays of some character.

But, this can change in the data-format object.  I can see that the 
Unicode description needs to be improved.

 The definition should talk about Unicode code points.
 The number of bytes then determines whether you can only
 represent the ASCII subset (1 byte), UCS2 (2 bytes, BMP only)
 or UCS4 (4 bytes, all currently assigned code points).

Yes, you are correct.  A string of unicode characters should really be 
represented in the same way that an array of integers is represented for 
a data-format object.
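In concrete terms (a small sketch, using Python only to check the ranges): the unit size fixes which code points a data-format can represent, as Marc-André describes.

```python
# A code point fits in a fixed-size unit iff it is below 2**(8*bytes):
# 1 byte covers only the Latin-1/ASCII range, 2 bytes the BMP (UCS2),
# 4 bytes all assigned code points (UCS4).

def fits(codepoint, unit_bytes):
    return codepoint < (1 << (8 * unit_bytes))

assert fits(0x41, 1)         # 'A' fits in 1 byte
assert not fits(0x20AC, 1)   # EURO SIGN needs at least 2 bytes
assert fits(0x20AC, 2)       # ... and is in the BMP
assert not fits(0x1D11E, 2)  # MUSICAL SYMBOL G CLEF is outside the BMP
assert fits(0x1D11E, 4)
```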

-Travis



Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-29 Thread Travis E. Oliphant
Greg Ewing wrote:
 Nick Coghlan wrote:
 I'd say the answer to where we put it will be dependent on what happens to 
 the 
 idea of adding a NumArray style fixed dimension array type to the standard 
 library. If that gets exposed through the array module as array.dimarray, 
 then 
 it would make sense to expose the associated data layout descriptors as 
 array.datatype.
 
 Seem to me that arrays are a sub-concept of binary data,
 not the other way around. So maybe both arrays and data
 types should be in a module called 'binary' or some such.

Yes, very good point.

That's probably one reason I'm proposing the data-type first before the 
array interface in the extended buffer protocol.

-Travis



Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-29 Thread Travis E. Oliphant
Greg Ewing wrote:
 Travis E. Oliphant wrote:
 
 The 'kind' does not specify how big the data-type (data-format) is.
 
 What exactly does "bit" mean in that context?   

Do you mean "big"?  It's how many bytes the kind is using.

So, 'u4' is a 4-byte unsigned integer and 'u2' is a 2-byte unsigned 
integer.


-Travis



Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-29 Thread Travis E. Oliphant
Greg Ewing wrote:
 Nick Coghlan wrote:
 
 Greg Ewing wrote:
 
 Also, what if I want to refer to fields by name
 but don't want to have to work out all the offsets
 
 Use the list definition form. With the changes I've 
 suggested above, you wouldn't even have to name the fields you don't 
 care about - just describe them.
 
 That would be okay.
 
 I still don't see a strong justification for having a
 one-big-string form as well as a list/tuple/dict form,
 though.

Compaction of representation is all.  It's used quite a bit in numarray, 
which is where most of the 'kind' names came from as well.  When you 
don't want to name fields it is a really nice feature (but it doesn't 
nest well).

-Travis




Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-29 Thread Travis E. Oliphant
Greg Ewing wrote:
 Travis E. Oliphant wrote:
 
 How to handle unicode data-formats could definitely be improved. 
 Suggestions are welcome.
 
 'U4*10'  string of 10 4-byte Unicode chars
 

I like that.  Thanks.

-Travis




Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-29 Thread Travis E. Oliphant
Martin v. Löwis wrote:
 Travis E. Oliphant schrieb:
 What is needed is a definitive way to describe data and then have

 array
 struct
 ctypes

 all be compatible with that same method.  That's why I'm proposing the 
 PEP.  It's a unification effort, not yet-another-method.
 
 As a unification mechanism, I think it is insufficient.  I doubt it
 can express all the concepts that ctypes supports.
 

Please clarify what you mean.

Are you saying that a single object can't carry all the information 
about binary data that ctypes allows with its multi-object approach?

I don't agree with you, if that is the case.  Sure, perhaps I've not 
included certain cases, so give an example.

Besides, I don't think this is the right view of unification.  I'm not 
saying that ctypes should get rid of its many objects used for 
interfacing with C functions.

I'm saying we should introduce a single-object mechanism for describing 
binary data so that the many-object approach of c-types does not become 
some kind of de-facto standard.  C-types can translate this 
object-instance to its internals if and when it needs to.

In the mean-time, how are other packages supposed to communicate binary 
information about data with each other?

Remember the context that the data-format object is presented in.  Two 
packages need to share a chunk of memory (the package authors do not 
know each other and have only Python as a common reference).  They 
both want to describe that the memory they are sharing has some 
underlying binary structure.

How do they do that? Please explain to me how the buffer protocol can be 
extended so that information about what is in the memory can be shared 
without a data-format object?

-Travis




Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-29 Thread Travis E. Oliphant
Martin v. Löwis wrote:
 Travis E. Oliphant schrieb:
 How to handle unicode data-formats could definitely be improved. 
 
 As before, I'm doubtful what the actual needs are. For example, is
 it desired to support generation of ID3v2 tags with such a data
 format? The tag is specified here:
 

Perhaps I was not clear enough about what I'm trying to do.   For a long 
time a lot of people have wanted something like Numeric in Python 
itself.  There have been many hurdles to that goal.

After discussions at SciPy 2006 with Guido, we decided that the best way 
to proceed at this point was to extend the buffer protocol to allow 
packages to share array-like information with each other.

There are several things missing from the buffer protocol that NumPy 
needs in order to be able to really understand the (fixed-size) memory 
another package has allocated and is sharing.

The most important of these is

1) Shape information
2) Striding information
3) Data-format information  (how is each element perceived).

Shape and striding information can be shared with a C-array of integers.

How is data-format information supposed to be shared?

We've come up with a very flexible way to do this in NumPy using a 
single Python object.  This Python object supports describing the layout 
of any fixed-size chunk of memory (right now in units of bytes --- bit 
fields could be added, though).
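A toy sketch (purely illustrative; not the proposed implementation and not NumPy's) of what bundling those three pieces of information in a single Python object looks like -- the class and attribute names here are invented:

```python
class ArrayInfo(object):
    """Toy container for the three items listed above (names invented)."""
    def __init__(self, shape, strides, format):
        self.shape = shape      # 1) shape information
        self.strides = strides  # 2) striding information (bytes per step)
        self.format = format    # 3) data-format of each element

# a C-contiguous 3x2 array of 8-byte doubles:
info = ArrayInfo(shape=(3, 2), strides=(16, 8), format='f8')
# for a C-contiguous array, the outer stride is the inner extent in bytes
assert info.strides[0] == info.shape[1] * 8
```

The point of the sketch is only that one object can answer all three questions, instead of passing shape, strides, and format around separately.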

I'm proposing to add this object to Python so that the buffer protocol 
has a fast and efficient way to share #3.   That's really all I'm after.

It also bothers me that so many ways to describe binary data are being 
used out there.  This is a problem that deserves being solved.  And, no, 
ctypes hasn't solved it (we can't directly use the ctypes solution). 
Perhaps this PEP doesn't hit all the corners, but a data-format object 
*is* a useful thing to consider.

The array object in Python already has a PyArray_Descr * structure that 
is a watered-down version of what I'm talking about.   In fact, this is 
what Numeric built from (or vice-versa actually).  And NumPy has greatly 
enhanced this object for any conceivable structure.

Guido seemed to think the data-type objects were nice when he saw them 
at SciPy 2006, and so I'm presenting a PEP.

Without the data-format object, I don't know how to extend the buffer 
protocol to communicate data-format information.  Do you have a better 
idea?

I have no trouble limiting the data-type object to the buffer protocol 
extension PEP, but I do think it could gain wider use.

 
 Is it the intent of this PEP to support such data structures,
 and allow the user to fill in a Unicode object, and then the
 processing is automatic? (i.e. in ID3v1, the string gets
 automatically Latin-1-encoded and zero-padded, in ID3v2, it
 gets automatically UTF-8 encoded, and null-terminated)


No, the point of the data-format object is to communicate information 
about data-formats not to encode or decode anything.   Users of the 
data-format object could decide what they wanted to do with that 
information.   We just need a standard way to communicate it through the 
buffer protocol.

-Travis



Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-28 Thread Travis E. Oliphant
Martin v. Löwis wrote:
 Travis E. Oliphant schrieb:
 The datatype is an object that specifies how a certain block of
 memory should be interpreted as a basic data-type. 

>>> datatype(float)
datatype('float64')
 
 I can't speak on the specific merits of this proposal, or whether this
 kind of functionality is desirable. However, I'm -1 on the addition of
 a builtin for this functionality (the PEP doesn't actually say that
 there is another builtin, but the examples suggest so).

I was intentionally vague.  I don't see a need for it to be a built-in, 
but didn't know where exactly to put it; I should have made it a 
question for discussion.

-Travis



Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-28 Thread Travis E. Oliphant
Greg Ewing wrote:
 Travis E. Oliphant wrote:
 PEP: unassigned
 Title: Adding data-type objects to the standard library
 
 Not sure about having 3 different ways to specify
 the structure -- it smacks of Too Many Ways To Do
 It to me.

You might be right, but they all have use-cases.  I've actually removed 
most of the multiple ways that NumPy allows for creating data-types.

 
 Also, what if I want to refer to fields by name
 but don't want to have to work out all the offsets

I don't know what you mean.   You just use the list-style to define a 
data-format with fields.  The offsets are worked out for you.   The only 
use for offsets was the dictionary form.  The dictionary form stems from 
a desire to use the fields dictionary of a data-type as a data-type 
specification (which it essentially is).

-Travis



Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-28 Thread Travis E. Oliphant
M.-A. Lemburg wrote:
 Travis E. Oliphant wrote:
 M.-A. Lemburg wrote:
 Travis E. Oliphant wrote:
 

 PEP: unassigned
 Title: Adding data-type objects to the standard library
   Attributes

  kind  --  returns the basic kind of the data-type. The basic 
 kinds
  are:
't' - bit, 
'b' - bool, 
'i' - signed integer, 
'u' - unsigned integer,
'f' - floating point,  
'c' - complex floating point, 
'S' - string (fixed-length sequence of char),
'U' - fixed length sequence of UCS4,
  Shouldn't this read "fixed length sequence of Unicode" ?!
 The underlying code unit format (UCS2 and UCS4) depends on the
 Python version.
 Well, in NumPy 'U' always means UCS4.  So, I just copied that over.  See 
 my questions at the bottom which talk about how to handle this.  A 
 data-format does not necessarily have to correspond to something Python 
 represents with an Object.
 
 Ok, but why are you being specific about UCS4 (which is an internal
 storage format), while you are not specific about e.g. the
 internal bit size of the integers (which could be 32 or 64 bit) ?
 

The 'kind' does not specify how big the data-type (data-format) is.  A 
number is needed to represent the number of bytes.

In this case, the 'kind' does not specify how large the data-type is. 
You can have 'u1', 'u2', 'u4', etc.

The same is true with Unicode.  You can have 10-character unicode 
elements, 20-character, etc.  But, we have to be clear about what a 
character is in the data-format.
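The 'u1'/'u2'/'u4' split between kind and itemsize parallels the standard struct module, where distinct codes denote unsigned integers of increasing width. A small sketch of that parallel (the pairing of codes to sizes below uses struct's standard sizes, which apply when the format is prefixed with '<'):

```python
import struct

# In the proposed notation 'u1', 'u2' and 'u4' all share kind 'u' and
# differ only in itemsize; struct spells the same three layouts with
# the codes 'B', 'H' and 'I'.
for code, nbytes in [('B', 1), ('H', 2), ('I', 4)]:
    assert struct.calcsize('<' + code) == nbytes
```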

-Travis






Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-28 Thread Travis E. Oliphant
Armin Rigo wrote:
 Hi Travis,
 
 On Fri, Oct 27, 2006 at 02:05:31PM -0600, Travis E. Oliphant wrote:
 This PEP proposes adapting the data-type objects from NumPy for
 inclusion in standard Python, to provide a consistent and standard
 way to discuss the format of binary data. 
 
 How does this compare with ctypes?  Do we really need yet another,
 incompatible way to describe C-like data structures in the standard
 library?

Part of what the data-type, data-format object is trying to do is bring 
together all the disparate ways to represent data that *already* exists 
in the standard library.

What is needed is a definitive way to describe data and then have

array
struct
ctypes

all be compatible with that same method.  That's why I'm proposing the 
PEP.  It's a unification effort, not yet-another-method.  One of the big 
reasons for it is to move something like the array interface into 
Python.  There are tens to hundreds of people mostly in the scientific 
computing community that want to see Python grow more support for 
NumPy-like things.  I keep getting requests to do something to make 
Python more aware of arrays.   This PEP is part of that effort.

In particular, something like the array interface should be available in 
Python.  The easiest way to do this is to extend the buffer protocol to 
allow objects to share information about shape, strides, and data-format 
of a block of memory.

But, how do you represent data-format in Python?  What will the objects 
pass back and forth to each other to do it?  Ctypes has a solution 
which creates multiple objects to do it.  This is an unwieldy, 
over-complicated solution for the array interface.

The array objects have a solution using the a single object that carries 
the data-format information. The solution we have for arrays deserves 
consideration.  It could be placed inside the array module if desired, 
but again, I'm really looking for something that would allow the extended 
buffer protocol (to be proposed soon) to share data-type information.

That could be done with the array-interface objects (strings, lists, and 
tuples), but then everybody who uses the interface will have to write 
their own decoders to process the data-format information.

I actually think ctypes would benefit from this data-format 
specification too.

Recognizing all these diverging ways to essentially talk about the same 
thing is part of what prompted this PEP.


-Travis



Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-28 Thread Travis E. Oliphant
Martin v. Löwis wrote:
 Travis E. Oliphant schrieb:
 In this case, the 'kind' does not specify how large the data-type is. 
 You can have 'u1', 'u2', 'u4', etc.

 The same is true with Unicode.  You can have 10-character unicode 
 elements, 20-character, etc.  But, we have to be clear about what a 
 character is in the data-format.
 
 That is certainly confusing. In u1, u2, u4, the digit seems to indicate
 the size of a single value (1 byte, 2 bytes, 4 bytes). Right? Yet,
 in U20, it does *not* indicate the size of a single value but of an
 array? And then, it's not the size, but the number of elements?
 

Good point.  In NumPy, unicode support was added in parallel with 
string arrays where there is not the ambiguity.   So, yes, it's true 
that the unicode case is a special-case.

The other way to handle it would be to describe the 'code'-point size 
(i.e. 'U1', 'U2', 'U4' for UCS-1, UCS-2, UCS-4) and then have the length 
be encoded as an array of those types.
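Under that alternative, the storage size falls out directly as code-point size times length; for example, ten UCS-4 code points always occupy 40 bytes. The sketch below uses 'utf-32-le' merely as a readily available UCS-4 byte layout:

```python
# a '(10,)U4'-style description: ten 4-byte code points -> 40 bytes
text = u'x' * 10
assert len(text.encode('utf-32-le')) == 40
```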

This was not the direction we took with NumPy (which is what I'm using 
as a reference) because I wanted Unicode and string arrays to look the 
same and thought of strings differently.

How to handle unicode data-formats could definitely be improved. 
Suggestions are welcome.


-Travis





[Python-Dev] PEP: Adding data-type objects to Python

2006-10-27 Thread Travis E. Oliphant


PEP: unassigned
Title: Adding data-type objects to the standard library
Version: $Revision: $
Last-Modified: $Date:  $
Author: Travis Oliphant [EMAIL PROTECTED]
Status: Draft
Type: Standards Track
Created: 05-Sep-2006
Python-Version: 2.6

Abstract

This PEP proposes adapting the data-type objects from NumPy for
inclusion in standard Python, to provide a consistent and standard
way to discuss the format of binary data. 

Rationale

There are many situations crossing multiple areas where an
interpretation is needed of binary data in terms of fundamental
data-types such as integers, floating-point, and complex
floating-point values.  Having a common object that carries
information about binary data would be beneficial to many
people. The creation of data-type objects in NumPy to carry the
load of describing what each element of the array contains
represents an evolution of a solution that began with the
PyArray_Descr structure in Python's own array object.  These
data-type objects can represent arbitrary byte data.  Currently
such information is usually constructed using strings and
character codes which is unwieldy when a data-type consists of
nested structures.

Proposal

Add a PyDatatypeObject in Python (adapted from NumPy's dtype
object which evolved from the PyArray_Descr structure in Python's
array module) that holds information about a data-type.  This object
will allow packages to exchange information about binary data in
a uniform way (see the extended buffer protocol PEP for an application
to exchanging information about array data). 

Specification

The datatype is an object that specifies how a certain block of
memory should be interpreted as a basic data-type. In addition to
being able to describe basic data-types, the data-type object can
describe a data-type that is itself an array of other data-types
as well as a data-type that contains arbitrary fields (structure
members) which are located at specific offsets. In its most basic
form, however, a data-type is of a particular kind (bit, bool,
int, uint, float, complex, object, string, unicode, void) and size.

Datatype objects can be created using either a type-object, a
string, a tuple, a list, or a dictionary according to the following
constructors:

Type-object: 

  For a select set of type-objects a data-type object describing that
  basic type can be described:

  Examples: 

   >>> datatype(float)
   datatype('float64')
  
   >>> datatype(int)
   datatype('int32')  # on 32-bit platform (64 if c-long is 64-bits)

Tuple-object
   
  A tuple of length 2 can be used to specify a data-type that is
  an array of another kind of basic data-type (this array always
  describes a C-contiguous array).

  Examples: 

   >>> datatype((int, 5))
   datatype(('int32', (5,)))
   # describes a 5*4=20-byte block of memory laid out as 
   #  a[0], a[1], a[2], a[3], a[4]

   >>> datatype((float, (3, 2)))
   datatype(('float64', (3, 2)))
   # describes a 3*2*8=48 byte block of memory that should be
   # interpreted as 6 doubles laid out as arr[0,0], arr[0,1],
   # ... arr[2,0], arr[2,1]
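The byte counts in the comments above can be checked against today's struct module (a rough equivalence only -- struct has no nested-shape notation, so the shapes are flattened by hand):

```python
import struct

# datatype((int, 5)): five 4-byte signed ints -> 20 bytes
assert struct.calcsize('<5i') == 20
# datatype((float, (3, 2))): six 8-byte doubles -> 48 bytes
assert struct.calcsize('<6d') == 48
```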


String-object:
 
  The basic format is '%s%s%s%d' % (endian, shape, kind, itemsize) 

 kind : one of the basic array kinds given below. 
 
 itemsize : the number of bytes (or bits for 't' kind) for 
 this data-type.  

 endian   : either '', '=' (native), '|' (doesn't matter),
 '>' (big-endian) or '<' (little-endian).

 shape: either '', or a shape-tuple describing a data-type that
 is an array of the given shape.

  A string can also be a comma-separated sequence of basic
  formats. The result will be a data-type with default field
  names: 'f0', 'f1', ..., 'fn'.

  Examples: 

   >>> datatype('u4')
   datatype('uint32')

   >>> datatype('f4')
   datatype('float32')

   >>> datatype('(3,2)f4')
   datatype(('float32', (3, 2)))

   >>> datatype('(5,)i4, (3,2)f4, S5')
   datatype([('f0', 'i4', (5,)), ('f1', 'f4', (3, 2)), ('f2', '|S5')])
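The total size of the last, comma-separated record can again be cross-checked with the struct module by flattening the sub-array shapes by hand:

```python
import struct

# '(5,)i4, (3,2)f4, S5': five 4-byte ints, six 4-byte floats and a
# 5-char string; flattened into struct notation that is '<5i6f5s'.
assert struct.calcsize('<5i6f5s') == 5*4 + 6*4 + 5  # == 49 bytes
```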


List-object:

  A list should be a list of tuples where each tuple describes a
  field. Each tuple should contain (name, datatype{, shape}) or
  ((meta-info, name), datatype{, shape}) in order to specify the
  data-type. 

  This list must fully specify the data-type (no memory holes). If you
  would like to return a data-type with memory holes where the
  compiler would place them, then pass the keyword align=1 to this
  construction.  This will result in un-named fields of Void kind of
  the correct size interspersed where needed.

  Examples: 

  datatype([( ([1,2],'coords'), 'f4', (3,6)), ('address', 'S30')])

  A data-type that could represent the 

Re: [Python-Dev] Fwd: [Python-checkins] r51236 - in python/trunk: Doc/api/abstract.tex Include/abstract.h Include/object.h Lib/test/test_index.py Misc/NEWS Modules/arraymodule.c Modules/mmapmodule.c M

2006-08-14 Thread Travis E. Oliphant
Travis E. Oliphant wrote:
 
 The idea is that the __index__() method should return an exact int or 
 exact long or this call will raise an error.  The restriction is present 
 to remove the possibility of infinite recursion (though I'm not sure 
 where that would occur exactly).
 

I just realized that returning a sub-class of Int and/or Long would not 
be a problem here.  The recursion problem that I was guarding against 
only arose when the possibility of returning any object with an 
.__index__() method was suggested.

Therefore, I think these exact int or exact long checks can and should 
be replaced with PyInt_Check() and PyLong_Check().

-Travis





Re: [Python-Dev] Fwd: [Python-checkins] r51236 - in python/trunk: Doc/api/abstract.tex Include/abstract.h Include/object.h Lib/test/test_index.py Misc/NEWS Modules/arraymodule.c Modules/mmapmodule.c M

2006-08-12 Thread Travis E. Oliphant
Neal Norwitz wrote:
 I checked in this fix for the __index__ clipping issue that's been
 discussed.  This patch is an improvement, but still needs some work.
 

 +/* Return a Python Int or Long from the object item
 +   Raise TypeError if the result is not an int-or-long
 +   or if the object cannot be interpreted as an index.
 +*/
 +PyObject *
  PyNumber_Index(PyObject *item)
  {
 -   Py_ssize_t value = -1;
  -   PyNumberMethods *nb = item->ob_type->tp_as_number;
  -   if (nb != NULL && HASINDEX(item) && nb->nb_index != NULL) {
  -   value = nb->nb_index(item);
 +   PyObject *result = NULL;
 +   if (item == NULL)
 +   return null_error();
 +   /* XXX(nnorwitz): should these be CheckExact?  Aren't subclasses ok? 
 */

The idea is that the __index__() method should return an exact int or 
exact long or this call will raise an error.  The restriction is present 
to remove the possibility of infinite recursion (though I'm not sure 
where that would occur exactly).


 Modified: python/trunk/Python/ceval.c
 ==
 --- python/trunk/Python/ceval.c (original)
 +++ python/trunk/Python/ceval.c Sat Aug 12 19:03:09 2006
 @@ -3866,12 +3866,14 @@
 if (v != NULL) {
 Py_ssize_t x;
 if (PyInt_Check(v)) {
 -   x = PyInt_AsSsize_t(v);
 +   /* XXX(nnorwitz): I think PyInt_AS_LONG is correct,
 +  however, it looks like it should be AsSsize_t.
 +  There should be a comment here explaining why.
 +   */
 +   x = PyInt_AS_LONG(v);

Right now throughout the Python code it is assumed that 
sizeof(Py_ssize_t) <= sizeof(long).  Because this code is an 
optimization for integers (or their sub-classes), it seems prudent to 
truly make it fast rather than make a function call that will just go 
through a series of checks to eventually make this very same call.


-Travis



Re: [Python-Dev] Fwd: [Python-checkins] r51236 - in python/trunk: Doc/api/abstract.tex Include/abstract.h Include/object.h Lib/test/test_index.py Misc/NEWS Modules/arraymodule.c Modules/mmapmodule.c M

2006-08-12 Thread Travis E. Oliphant
Neal Norwitz wrote:
 I checked in this fix for the __index__ clipping issue that's been
 discussed.  This patch is an improvement, but still needs some work.
 Please pull out any parts you have an issue with and suggest a patch
 to address your concern.
 

For me the only remaining concern is that quite often in the code we do this

if (PyIndex_Check(obj)) {
    ...
    key = PyNumber_Index(obj);
    /* or */
    key_value = PyNumber_AsSsize_t(obj, ...)
}
else { /* remaining checks */ }


Internally PyNumber_AsSsize_t makes a call to PyNumber_Index, and 
PyNumber_Index also calls PyIndex_Check as well.  So, basically we 
end up calling PyIndex_Check(obj) 2 times when only one check should be 
necessary.

This code could be re-written to move any other type checks first and 
replace the PyIndex_Check(obj) code with PyNumber_Index(obj) and error 
handling but I'm not sure if that's the right way to go or if it's worth 
it.
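The same machinery is visible from pure Python through operator.index(), which is the Python-level face of PyNumber_Index (sketch; the Size class below is invented for illustration):

```python
import operator

class Size(object):
    """Illustrative non-int object that can serve as an index."""
    def __init__(self, n):
        self.n = n
    def __index__(self):
        return self.n

s = Size(3)
assert operator.index(s) == 3      # PyNumber_Index at the C level
assert [10, 20, 30, 40][s] == 40   # sequence indexing uses the same slot
```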

-Travis Oliphant





Re: [Python-Dev] Fwd: [Python-checkins] r51236 - in python/trunk: Doc/api/abstract.tex Include/abstract.h Include/object.h Lib/test/test_index.py Misc/NEWS Modules/arraymodule.c Modules/mmapmodule.c M

2006-08-12 Thread Travis E. Oliphant
Tim Peters wrote:
 [Travis E. Oliphant]
 Right now throughout the Python code it is assumed that
 sizeof(Py_ssize_t) <= sizeof(long).
 
 If you find any code like that, it's a bug.  Win64 is a platform where
 it's false.

Sorry for my confusion.  I meant to say that it is assumed that

sizeof(long) <= sizeof(Py_ssize_t)

because the assumption is that a Python Int (stored as a long) will 
*always* fit into a Py_ssize_t.

I think this is true on all platforms.

-Travis



Re: [Python-Dev] __index__ clipping

2006-08-11 Thread Travis E. Oliphant
Travis E. Oliphant wrote:
 Here is my C-API proposal
 
 1) PyIndex_Check(obj)
 
Similar to PyIter_Check(obj) as it just checks for whether or not the 
 object can call nb_index.  Actually implemented as a macro.
 
 2) PyObject* PyNumber_Index(obj)
 
   Simple interface around nb_index that calls it if possible and returns
   TypeError if not (or if the result does not end up in an exact
   int-or-long).
 
 3) Py_ssize_t PyNumber_AsSsize_t(obj, err)
 
 This converts the object to a Py_ssize_t by using the nb_index call
 if possible.  The err argument is a Python error object and specifies
 what error should be raised should the conversion from Int-or-Long to
 Py_ssize_t cause Overflow.
 
 If err is NULL, then no error will be raised and the result will be 
 clipped.  Other-wise the corresponding error will be set.  Internally 
 PyExc_IndexError and PyExc_OverflowError will be the errors used.
 

This proposal is implemented as patch 1538606 
http://sourceforge.net/tracker/index.php?func=detail&aid=1538606&group_id=5470&atid=305470


-Travis



Re: [Python-Dev] __index__ clipping

2006-08-10 Thread Travis E. Oliphant
Guido van Rossum wrote:
 On 8/10/06, Nick Coghlan [EMAIL PROTECTED] wrote:
 Guido van Rossum wrote:
 It seems like Nick's recent patches solved the problems that were
 identified.
 Nick, can you summarize how your patches differ from my proposal?
 nb_index and __index__ are essentially exactly as you propose.
 
 Then I don't understand why Travis is objecting against my proposal!

I must have missed his most recent patch that changed the result to 
return a Python object.  I thought earlier versions didn't do that.

My objection is not particularly solid.  At this point it's largely a 
wish to avoid the extra overhead (there are some camps that already 
complain about NumPy having too much indexing overhead --- although they 
should be using Python lists for their purposes if indexing overhead 
really is a problem).

Since it appears that several people feel this is the best way forward, 
I'll re-work my NumPy code.   I still appreciate the change that 
allows other Python objects to be integers and eliminates the "only true 
integers allowed" flavor of several places in the Python code.

Thanks for all your hard work.

-Travis



Re: [Python-Dev] __index__ clipping

2006-08-09 Thread Travis E. Oliphant
Guido van Rossum wrote:
 Here's another issue where Neal thought it would be useful if I
 weighed in. I'm not quite sure of the current status, but perhaps the
 following would work?
 
 - Called from Python, (10**10).__index__() should return 10000000000L,
 not raise an exception or return sys.maxint.
 
 - The nb_index slot is changed to return an object; it should return a
 Python int or long without clipping (same as __index__() called from
 Python).

I don't like this approach because nb_int and nb_long are already 
available to convert an object into an int or a long, so the only value 
of nb_index is to make sure that floats aren't able to be converted this 
way.
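That distinction is observable from Python: ints (and anything defining __index__) pass through operator.index(), while floats, which support int() conversion but deliberately lack __index__, are rejected:

```python
import operator

assert operator.index(7) == 7   # exact ints pass through unchanged
assert int(3.9) == 3            # plain int conversion truncates a float...
try:
    operator.index(3.9)         # ...but the index slot refuses it
    raised = False
except TypeError:
    raised = True
assert raised
```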

The value of nb_index is that it provides a small-overhead means to 
quickly convert any object that allows it into a Py_ssize_t value which 
is needed in many internal situations (mostly indexing related).

Rather than assume a Python integer is the only candidate for conversion 
to a Py_ssize_t value,  nb_index allows other objects to be used in that 
fashion.  Having another level of indirection where one object is first 
converted to a Python int or Python long (whichever is needed) and then 
to a Py_ssize_t value seems like un-necessary waste to me and will 
needlessly slow certain indexing operations down.

It seems like Nick's recent patches solved the problems that were 
identified.

-Travis













 
 - There should be three C APIs (perhaps fewer if some of them are never 
 needed):
 
   - One to call the nb_index slot, or raise an exception if it's not
 set, returning the object
 
   - One to return the clipped value without an exception
 (can still return an exception if something else went wrong)
 
   - One to return the value if it fits, raise OverflowError if it doesn't
 
 I know this is quite the deviation from the current code but it seems
 the most rational and future-proof approach.
 



[Python-Dev] segfault when using PyGILState_Ensure/Release in Python2.3.4

2006-07-21 Thread Travis E. Oliphant

I'm hoping somebody here can help me with an error I'm getting in Python 
2.3.4 but not in Python 2.4.2 when I use PyGILState_Ensure in NumPy on 
Linux.

Perhaps somebody can point out what I'm doing wrong because while I've 
tried to understand the threading API it can be a bit confusing and 
maybe I'm doing it wrong.

Right now, I've had to disable threading support in NumPy for Python 2.3 
which is a bit annoying.

The problem shows up when I've released the GIL using 
PyEval_SaveThread() in one section of code.  Then the code calls 
functions that don't involve the Python C-API.

Then another function sometimes requires use of the C-API to set a 
Python Error or issue a warning.   So I call:

_save = PyGILState_Ensure();

Use Python C-API to issue an error or warning

Finally, before exiting this function

PyGILState_Release(_save);

is called.   Later when control returns to the original caller that 
released the GIL, PyEval_RestoreThread() is called.  But the segfault 
seems to be happening on the call to PyGILState_Release(_save);

All of this works fine when it runs under Python 2.4.2, but under Python 
2.3.4 I get a segfault.

Does anybody have any ideas?   Thanks very much.



[Python-Dev] Possible bug in complexobject.c (still in Python 2.5)

2006-05-31 Thread Travis E. Oliphant

I'm curious about the difference between

float_subtype_new  in floatobject.c
complex_subtype_from_c_complex in complexobject.c

The former uses type->tp_alloc(type, 0) to create memory for the object 
while the latter uses PyType_GenericAlloc(type, 0) to create memory for 
the sub-type (thereby by-passing the sub-type's own memory allocator).

It seems like this is a bug.   Shouldn't type->tp_alloc(type, 0) also be 
used in the case of allocating complex subtypes?

This is causing problems in NumPy because we have a complex type that is 
a sub-type of the Python complex scalar.  It sometimes uses the 
complex_new code to generate instances (so that the __new__ interface is 
the same),  but because complex_subtype_from_c_complex is using 
PyType_GenericAlloc this is causing errors.

I can work around this by not calling the __new__ method for the base 
type but this is not consistent.


Thanks for any feedback,

-Travis





Re: [Python-Dev] Possible bug in complexobject.c (still in Python 2.5)

2006-05-31 Thread Travis E. Oliphant

Guido van Rossum wrote:

I wouldn't be surprised if this is a genuine bug; the complex type
doesn't get a lot of love from core developers.

Could you come up with a proposed fix, and a unit test showing that it
works (and that the old version doesn't)? (Maybe a good unit test
would require writing a custome C extension; in that case just show
some sample code.)



Attached is a sample module that exposes the problem.  The problem goes 
away by replacing


op = PyType_GenericAlloc(type, 0);

with

op = type->tp_alloc(type, 0);

in the function

complex_subtype_from_c_complex

in the file complexobject.c  (about line #191).



The problem with a unit test is that it might not fail.  On my Linux 
system, it doesn't complain about the problem unless I first use strict 
pointer checking with


export MALLOC_CHECK_=2

Then the code

import bugtest
a = bugtest.newcomplex(3.0)
del a

Aborts

Valgrind also shows the error when running the simple code. It seems 
pretty clear to me that the subtype code should be calling the sub-type's 
tp_alloc code instead of the generic one.



Best regards,

-Travis











#include "Python.h"


typedef struct {
	PyObject_HEAD
	double real;
	double imag;
} PyNewComplexObject;


static PyTypeObject PyComplex_SubType = { 
PyObject_HEAD_INIT(NULL)
0,   /*ob_size*/
"newcomplex",		   /*tp_name*/
sizeof(PyNewComplexObject),/*tp_basicsize*/
};

static PyObject *
_complex_alloc(PyTypeObject *type, int nitems)
{
	PyObject *obj;

	obj = (PyObject *)malloc(_PyObject_SIZE(type));
	memset(obj, 0, _PyObject_SIZE(type));
	PyObject_INIT(obj, type);
	return obj;
}

PyMODINIT_FUNC initbugtest(void) {
	PyObject *m, *d;
	m = Py_InitModule("bugtest", NULL);
	d = PyModule_GetDict(m);

	PyComplex_SubType.tp_free = free;
	PyComplex_SubType.tp_alloc = _complex_alloc;
	PyComplex_SubType.tp_base = &PyComplex_Type;	
	PyComplex_SubType.tp_flags = Py_TPFLAGS_DEFAULT | Py_TPFLAGS_CHECKTYPES;

PyType_Ready(&PyComplex_SubType);
	Py_INCREF(&PyComplex_SubType);
	PyDict_SetItemString(d, "newcomplex", 
			 (PyObject *)&PyComplex_SubType);
	return;
}


Re: [Python-Dev] Possible bug in complexobject.c (still in Python 2.5)

2006-05-31 Thread Travis E. Oliphant
Travis E. Oliphant wrote:
 I'm curious about the difference between
 
 float_subtype_new  in floatobject.c
 complex_subtype_from_c_complex in complexobject.c
 
 The former uses type->tp_alloc(type, 0) to create memory for the object
 while the latter uses PyType_GenericAlloc(type, 0) to create memory for
 the sub-type (thereby by-passing the sub-type's own memory allocator).
 
 It seems like this is a bug.   Shouldn't type->tp_alloc(type, 0) also be
 used in the case of allocating complex subtypes?

I submitted an entry and a patch for this on SourceForge Tracker (#1498638)

http://sourceforge.net/tracker/index.php?func=detail&aid=1498638&group_id=5470&atid=105470


-Travis



Re: [Python-Dev] adding Construct to the standard library?

2006-04-24 Thread Travis E. Oliphant
Greg Ewing wrote:
 Travis Oliphant wrote:
 
 For what it's worth,  NumPy also defines a data-type object which it 
 uses to describe the fundamental data-type of an array.  In the context 
 of this thread it is also yet another way to describe a binary-packed 
 structure in Python.
 
 Maybe there should be a separate module providing
 a data-packing facility that ctypes, NumPy, etc.
 can all use (perhaps with their own domain-specific
 extensions to it).
 
 It does seem rather silly to have about 3 or 4
 different incompatible ways to do almost exactly
 the same thing (struct, ctypes, NumPy and now
 Construct).

I agree.  Especially with ctypes and struct now in the standard library. 
  The problem, however, is that every module does something a little-bit 
different with the object.   NumPy needs a built-in object with at least 
a few fields defined.

The idea of specifying the data-type is different then it's 
representation to NumPy.

After looking at it, I'm not particularly fond of Construct's way to 
specify data-types, but then again we've been developing the array 
interface for just this purpose and so have some biased opinions.

Some kind of data-type specification would indeed be useful.  NumPy 
needs a built-in (i.e. written in C) data-type object internally.  If 
that builtin object were suitable generally then all the better.

For details, look at http://numeric.scipy.org/array_interface  (in 
particular the __array_descr__ field of the interface for what we came 
up with last year over several months of discussion.


-Travis



Re: [Python-Dev] Expose the array interface in Python 2.5?

2006-03-18 Thread Travis E. Oliphant
Paul Moore wrote:
 On 3/17/06, Thomas Heller [EMAIL PROTECTED] wrote:
 
 Travis E. Oliphant wrote:
 Would it be possible to add at least the C-struct array interface to the
 Python arrayobject in time for Python 2.5?
 I'm very much for that.

 Is someone on this list willing to help make it happen?
 Unfortunately not me - I'm too busy with ctypes, and if the array interface
 makes it into the core I will have to implement/use that in ctypes too.

The ctypes interface is also another reason I really want a basic array 
interface into the core.

As the very clever people on this list noticed, I was actually asking 
two different things because I was really looking for support from 
people who wanted either one...

What I would really like to see is a very-simple C-object defined 
(perhaps not even exposed to the Python user but only the 
extension-module writer).

This C-object would have the structure of NumPy arrays, be inheritable, 
and define an __array_struct__ method (notice I'm not even asking at 
this point for a new function-pointer table like the buffer protocol has 
--- although I think at some point that could be useful and maybe it is 
even better right now to push for that).

The purpose of the C-object would be so that all extension writers to 
Python can rely on a simple but general-purpose description of an array 
that Numeric has established over the past decade.

This C-object already has been somewhat fleshed out and the only real 
work would be to add support for it on other objects.

 If all that is required is a simple C definition added to the Python
 code, then it seems to me that would be (a) fairly easy, but (b)
 pretty useless (on the basis that a standards-style PEP like the DB
 API ones would do much the same, without any code changes).

I think I would like to see a real object that can be inherited from (at 
least in C extension modules).  Again perhaps in Python 2.5, this object 
is not even intentionally exposed to the Python level.

 
 On the third hand, the PEP referenced from the page you quote (at
 http://svn.scipy.org/svn/PEP/PEP_genarray.txt) seems to specify an
 implementation - are you just asking for that to be added to the core?
 If so, why not submit this as a formal PEP, and see what happens from
 there? (For this route, one of the Numeric people *needs* to champion
 the PEP, IMHO, as only they are qualified to address any points that
 get raised in general discussion).


The big problem is the release schedule of Python 2.5,  I've really 
wanted to get something along these lines going for Python 2.5, but 
I've not had the time to finish the PEP.  So, perhaps I'll tone down the 
request to a simple C-object with even less defined and put that out.

Frankly, I'm really looking for help from an avid Python-dev person who
may have an interest but perhaps not much experience with arrays.

-Travis




[Python-Dev] Expose the array interface in Python 2.5?

2006-03-17 Thread Travis E. Oliphant

Last year, during development of NumPy, the concept of the array 
interface was developed as a means to share array_like data.  It was 
realized that it is this concept that needs to be explored further and 
pushed into Python itself, rather than a particular incarnation of an 
array.

It is very important for many people to get access to memory with some 
description of how that memory should be interpreted as an array. 
Several Python packages could benefit if Python had some notion of an 
array interface that was at least supported in a duck-typing fashion.

The description of what we've come up with so far and is implemented in 
NumPy (and Numarray and last released Numeric) is at

http://numeric.scipy.org/#array_interface
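
As a rough duck-typed sketch (the Vec class is illustrative; the authoritative key names and semantics live at the URL above), an object can advertise its memory like this:

```python
import array
import sys

class Vec:
    # Illustrative object exposing its memory through the duck-typed
    # array interface (key names per the spec linked above).
    def __init__(self, values):
        self._buf = array.array('d', values)

    @property
    def __array_interface__(self):
        ptr, length = self._buf.buffer_info()
        endian = '<' if sys.byteorder == 'little' else '>'
        return {
            'version': 3,
            'shape': (length,),
            'typestr': endian + 'f8',   # 8-byte IEEE float
            'data': (ptr, False),       # (address, read-only flag)
        }

v = Vec([1.0, 2.0, 3.0])
print(v.__array_interface__['shape'])   # -> (3,)
```

A consumer that understands the interface can then wrap the memory described by 'data' and 'shape' without copying it.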

Quite a few of us would love to see this get into Python proper, but 
have very little free-time to spare to make sure it happens.

Would it be possible to add at least the C-struct array interface to the 
Python arrayobject in time for Python 2.5?

Is someone on this list willing to help make it happen?

In NumPy, there is also a reasonably good way to describe the 
data-type of arbitrary data, that fell out of the discussions over the 
array interface.  I think something like this could eventually find its 
way into Python as well.

We would love any feedback from the Python community on the array 
interface.  Especially because we'd like to see it in Python itself and 
supported and used by every relevant Python package sooner rather than 
later.

Thanks,

-Travis Oliphant







Re: [Python-Dev] Expose the array interface in Python 2.5?

2006-03-17 Thread Travis E. Oliphant
Nick Coghlan wrote:
 Travis E. Oliphant wrote:
 Would it be possible to add at least the C-struct array interface to the 
 Python arrayobject in time for Python 2.5?
 
 Do you mean simply adding an __array_shape__ attribute that consists of a 
 tuple with the array length, and an __array_type__ attribute set to 'O'?
 
 Or trying to expose the array object's data?

I was thinking more the __array_struct__ (in particular the C-structure 
that defines it).

 
 The former seems fairly pointless, and the latter difficult (since it has 
 implications for moving the data store when the array gets resized).

Sure, it's the same problem as exposing through the buffer protocol. 
Since we already have that problem, why try to pretend we don't?
 
 I've spent a fair bit of time looking at this interface, and while I'm a big
 fan of the basic idea, I'm not convinced that it makes sense to
 include the interface in the core without *also* adopting a common convention
 for multi-dimensional fixed shape indexing (e.g. by introducing a simple
 dimensioned array type as something like array.dimarray).

True, such a thing would be great, but it could also be written in 
Python fairly quickly building on top of the array and serve as a simple 
example.

My big quest is to get PIL, PyVox, WxPython, PyOpenGL, and so forth to 
be able to use the same interface.  Blessing the interface by including 
it in the Python core would help.  I'm also just wanting people in 
py-dev to get the concept of an array interface on their radar, as 
discussions of new bytes types emerges.

Sometimes, there is not enough cross-talk between numpy-discussions and 
pydev.  This is our fault, of course, but we're often swamped (I know I 
am...), and it can take some effort for us array people to figure out 
what's going on in the depths of Python sufficiently to comprehend some 
of the discussions here.

 
 The fact that array.array is a mutable sequence rather than a fixed shape
 array means that it doesn't mesh particularly well with the ideas behind the 
 array interface. numpy arrays can have their shape changed via reshape, but 
 they impose the rule that the total number of elements can't change so that 
 the allocated memory doesn't need to be moved - the standard library's array 
 type has no such limitation.

This is not really a limitation of numpy arrays either.  Check the 
resize method...  But, I understand your point that array.array's are 
more-like lists.  Of course, when they behave that way, their buffer 
interface is presently broken.   So, maybe the array.array is 
sufficiently broken to not be worth fixing, but what else should be done?

I'm kind of tired of this problem dragging on and on.  The Numeric 
header (essentially what the __array_struct__ exposes) is now basically 
unchanged for over 10 years and yet its direct support by Python is 
still not there.   The Python community has been very helpful over the 
years, but we need more direct discussion with Python developers to help 
things along.  I'm grateful Nick has responded.  If anyone else has any 
interest in these ideas, please sound off.

 
 Aside from the obvious (the use of Ellipsis and permitting multiple
 dimensions), there are a number of ways in which the semantics of numpy array
 subscripts differ from normal sequence subcripts, and which of these should be
 part of the common multi-dimensional indexing conventions needs to be thrashed
 out in a PEP:

While these are interesting academic issues, the problem with most of 
these comments is that you will get loud voices of disapproval if any of 
these conventions change significantly from what has become standard 
via Numeric's use over 10 years.

I think no one is up to the task of trying to reconcile Numeric 
behavior with Python-dev opinions of what 'ought' to be, unless the 
basic usage does not change too much.

 
- numpy array slices are views that permit mutation of the original object
  (slicing a sequence creates a copy of the sliced section)

Not really open for discussion among Numeric Python users as it's been 
debated for years always coming to the same (keep the current behavior) 
conclusion.
 
- assignment to slices is not allowed to change the shape of a numpy array
  (assigning to a slice of a normal sequence may change the total length)

People might be open to this idea, as it adds a new feature and doesn't 
signficantly change other usages.

 
- deletion of slices is not permitted by numpy arrays
  (deleting a slice of a sequence changes the total length)


Also something people might accept.

- NewAxis is a novel use of subscript notation

True, but not something we can really change.

 
- there are sophisticated rules to try to align numpy array shapes

You are speaking of broadcasting.  These could of course be discussed, 
but current behavior is entrenched

 
- assignment of a sequence to a numpy array section is rather 
 disconcerting

[Python-Dev] Strange behavior in Python 2.5a0 (trunk) --- possible error in AST?

2006-03-13 Thread Travis E. Oliphant

I'm seeing strange behavior in the Python 2.5a0 trunk that is causing 
the tests for numpy to fail.  Apparently obj[...] = 1 is not calling 
PyObject_SetItem

Here is a minimal example to show the error.  Does anyone else see this?

class temp(object):
def __setitem__(self, obj, v):
print obj, v

mine = temp()
mine[...] = 1
mine[0] = 1


Output from Python 2.4.2:
Ellipsis 1
0 1


Output from Python 2.5a0:
0 1


In other words, the first line does nothing in Python 2.5a0.


Does anybody else see this?

Thanks,

-Travis



Re: [Python-Dev] Strange behavior in Python 2.5a0 (trunk) --- possible error in AST?

2006-03-13 Thread Travis E. Oliphant
Nick Coghlan wrote:
 
 And how...
 
case Ellipsis_kind:
  ADDOP_O(c, LOAD_CONST, Py_Ellipsis, consts)
  break;
 
 Just a couple of minor details missing, like, oh, compiling the actual 
 subscript operation :)
 
 Bug here: http://www.python.org/sf/1448804
 
 (assigned to myself, since I already wrote a test for it and worked out where 
 to fix it)

Fabulous!  The fix committed to SVN seems to work.

Now, all of numpy's unit tests are passing with Python 2.5a0.

Great work, thank you.

-Travis



Re: [Python-Dev] ssize_t branch merged

2006-02-18 Thread Travis E. Oliphant
Martin v. Löwis wrote:
 Neal Norwitz wrote:
 
I suppose that might be nice, but would require configure magic.  I'm
not sure how it could be done on Windows.
 
 
 Contributions are welcome. On Windows, it can be hard-coded.
 
 Actually, something like
 
 #if SIZEOF_SIZE_T == SIZEOF_INT
 #define PY_SSIZE_T_MAX INT_MAX
 #elif SIZEOF_SIZE_T == SIZEOF_LONG
 #define PY_SSIZE_T_MAX LONG_MAX
 #else
 #error What is size_t equal to?
 #endif
 
 might work.


Why not just

#if SIZEOF_SIZE_T == 2
#define PY_SSIZE_T_MAX 0x7fff
#elif SIZEOF_SIZE_T == 4
#define PY_SSIZE_T_MAX 0x7fffffff
#elif SIZEOF_SIZE_T == 8
#define PY_SSIZE_T_MAX 0x7fffffffffffffff
#elif SIZEOF_SIZE_T == 16
#define PY_SSIZE_T_MAX 0x7fffffffffffffffffffffffffffffff
#endif

?



Re: [Python-Dev] ssize_t branch merged

2006-02-17 Thread Travis E. Oliphant
Tim Peters wrote:
 [Travis Oliphant]
 
Maybe I have the wrong version of code.  In my pyport.h (checked out
from svn trunk) I have.

#define PY_SSIZE_T_MAX ((Py_ssize_t)(((size_t)-1)>>1))

What is size_t?
 
 
 size_t is an unsigned integral type defined by, required by, and used
 all over the place in standard C.  What exactly is the compiler
 message you get, and exactly which compiler are you using (note that
 nobody else is having problems with this, so there's something unique
 in your setup)?

I'm very sorry for my silliness.  I do see the problem I was having now. 
   Thank you for helping me out.  I was assuming that PY_SSIZE_T_MAX 
could be used in a  pre-processor statement like LONG_MAX and INT_MAX.

In other words

#if PY_SSIZE_T_MAX != INT_MAX

This was giving me errors and I tried to understand the #define 
statement as an arithmetic operation (not a type-casting one).  I did 
know about size_t but thought it strange that 1 was being subtracted 
from it.

I would have written this as (size_t)(-1) to avoid that confusion.  I do 
apologize for my error.  Thank you for taking the time to explain it.
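
The identity behind the macro is easy to check numerically (a quick sketch, independent of any particular platform):

```python
# For an N-byte size_t, (size_t)-1 is the all-ones bit pattern;
# shifting it right by one yields the largest signed value.
for nbytes in (2, 4, 8):
    bits = 8 * nbytes
    all_ones = (1 << bits) - 1       # value of (size_t)-1
    ssize_t_max = all_ones >> 1
    assert ssize_t_max == (1 << (bits - 1)) - 1
    print(hex(ssize_t_max))          # 0x7fff, 0x7fffffff, 0x7fffffffffffffff
```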

I still think that PY_SSIZE_T_MAX ought to be usable in a pre-processor 
statement, but it's a nit.

Best,

-Travis



 
 No.  (size_t)-1 casts -1 to the unsigned integral type size_t,

That's what I was missing; I saw this as subtraction, not type-casting. 
My mistake.

-Travis



Re: [Python-Dev] bytes type discussion

2006-02-14 Thread Travis E. Oliphant
Guido van Rossum wrote:
 I'm about to send 6 or 8 replies to various salient messages in the
 PEP 332 revival thread. That's probably a sign that there's still a
 lot to be sorted out. In the mean time, to save you reading through
 all those responses, here's a summary of where I believe I stand.
 Let's continue the discussion in this new thread unless there are
 specific hairs to be split in the other thread that aren't addressed
 below or by later posts.


I hope bytes objects will be pickle-able?  If so, and they support the 
buffer protocol, then many NumPy users will be very happy.

-Travis



[Python-Dev] Please comment on PEP 357 -- adding nb_index slot to PyNumberMethods

2006-02-14 Thread Travis E. Oliphant
After some revisions, PEP 357 is ready for more comments.  Please voice 
any concerns.


-Travis
PEP: 357
Title: Allowing Any Object to be Used for Slicing
Version: $Revision: 42367 $
Last Modified: $Date: 2006-02-14 18:12:07 -0700 (Tue, 14 Feb 2006) $
Author: Travis Oliphant [EMAIL PROTECTED]
Status: Draft
Type: Standards Track
Created: 09-Feb-2006
Python-Version: 2.5

Abstract

This PEP proposes adding an nb_index slot in PyNumberMethods and an
__index__ special method so that arbitrary objects can be used
whenever only integers are called for in Python, such as in slice
syntax (from which the slot gets its name).

Rationale

Currently integers and long integers play a special role in
slicing in that they are the only objects allowed in slice
syntax. In other words, if X is an object implementing the
sequence protocol, then X[obj1:obj2] is only valid if obj1 and
obj2 are both integers or long integers.  There is no way for obj1
and obj2 to tell Python that they could be reasonably used as
indexes into a sequence.  This is an unnecessary limitation.

In NumPy, for example, there are 8 different integer scalars
corresponding to unsigned and signed integers of 8, 16, 32, and 64
bits.  These type-objects could reasonably be used as integers in
many places where Python expects true integers but cannot inherit from 
the Python integer type because of incompatible memory layouts.  
There should be some way to be able to tell Python that an object can 
behave like an integer.

It is not possible to use the nb_int (and __int__ special method)
for this purpose because that method is used to *coerce* objects
to integers.  It would be inappropriate to allow every object that
can be coerced to an integer to be used as an integer everywhere
Python expects a true integer.  For example, if __int__ were used
to convert an object to an integer in slicing, then float objects
would be allowed in slicing and x[3.2:5.8] would not raise an error
as it should.

Proposal
 
Add an nb_index slot to PyNumberMethods, and a corresponding
__index__ special method.  Objects could define a function to place
in the nb_index slot that returns an appropriate C-integer (Py_ssize_t
after PEP 353).  This C-integer will be used whenever Python needs
one such as in PySequence_GetSlice, PySequence_SetSlice, and
PySequence_DelSlice.  

Specification:

1) The nb_index slot will have the signature

   Py_ssize_t index_func (PyObject *self)

2) The __index__ special method will have the signature

   def __index__(self):
   return obj
   
   Where obj must be either an int or a long or another object that has the
   __index__ special method (but not self).

3) A new C-API function PyNumber_Index will be added with signature

   Py_ssize_t PyNumber_index (PyObject *obj)

   which will special-case integer and long integer objects but otherwise
   return obj->ob_type->tp_as_number->nb_index(obj) if it is available. 
   A -1 will be returned and an exception set on an error. 

4) A new operator.index(obj) function will be added that calls
   equivalent of obj.__index__() and raises an error if obj does not 
implement
   the special method.
   
Implementation Plan

1) Add the nb_index slot in object.h and modify typeobject.c to 
   create the __index__ method

2) Change the ISINT macro in ceval.c to ISINDEX and alter it to 
   accommodate objects with the index slot defined.

3) Change the _PyEval_SliceIndex function to accommodate objects
   with the index slot defined.

4) Change all builtin objects (e.g. lists) that use the as_mapping 
   slots for subscript access and use a special-check for integers to 
   check for the slot as well.

5) Add PyNumber_Index C-API to return an integer from any 
   Python Object that has the nb_index slot.  

6) Add the operator.index(x) function.


Possible Concerns

Speed: 

Implementation should not slow down Python because integers and long
integers used as indexes will complete in the same number of
instructions.  The only change will be that what used to generate
an error will now be acceptable.

Why not use nb_int which is already there?:

The nb_int method is used for coercion and so means something
fundamentally different than what is requested here.  This PEP
proposes a method for something that *can* already be thought of as
an integer to communicate that information to Python when it needs an
integer.  The biggest example of why using nb_int would be a bad
thing is that float objects already define the nb_int method, but
float objects *should not* be used as indexes in a sequence.
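
A short sketch of the distinction between coercion and indexing:

```python
# int() goes through the nb_int slot and *coerces*, silently truncating:
print(int(3.2))          # -> 3

# Slicing must not accept that coercion; float bounds are rejected:
try:
    [10, 20, 30, 40][3.2:5.8]
except TypeError:
    print("floats rejected in slices")
```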

Why the name __index__?:

Some questions were raised regarding the name __index__ when other
interpretations of the slot 

Re: [Python-Dev] PEP for adding an sq_index slot so that any object, a or b, can be used in X[a:b] notation

2006-02-09 Thread Travis E. Oliphant
Adam Olsen wrote:
 On 2/9/06, Travis Oliphant [EMAIL PROTECTED] wrote:
 
Guido seemed accepting to this idea about 9 months ago when I spoke to
him.  I finally got around to writing up the PEP.   I'd really like to
get this into Python 2.5 if possible.
 
 
 -1
 
 I've detailed my reasons here:
 http://mail.python.org/pipermail/python-dev/2006-January/059851.html
 
 In short, there are purely math usages that want to convert to int
 while raising exceptions from inexact results.  The name __index__
 seems inappropriate for this, and I feel it would be cleaner to fix
 float.__int__ to raise exceptions from inexact results (after a
 suitable warning period and with a trunc() function added to math.)


I'm a little confused.  Is your opposition solely due to the fact that 
you think float's __int__ method ought to raise exceptions and the 
apply_slice code should therefore use the __int__ slot?

In theory I can understand this reasoning.  In practice, however, the 
__int__ slot has been used for coercion and changing the semantics of 
int(3.2) at this stage seems like a recipe for lots of code breakage.  I 
don't think something like that is possible until Python 3k.

If that is not your opposition, please be more clear. Regardless of how 
it is done, it seems rather unPythonic to only allow 2 special types to 
be used in apply_slice and assign_slice.

-Travis



Re: [Python-Dev] PEP for adding an sq_index slot so that any object, a or b, can be used in X[a:b] notation

2006-02-09 Thread Travis E. Oliphant
Bengt Richter wrote:

 
 How about if SLICE byte code interpretation would try to call
 obj.__int__() if passed a non-(int,long) obj ? Would that cover your use case?


I believe that this is pretty much exactly what I'm proposing.  The 
apply_slice and assign_slice functions in ceval.c are called for the 
SLICE and STORE_SLICE and DELETE_SLICE opcodes.

 BTW the slice type happily accepts anything for start:stop:step I believe,
 and something[slice(whatever)] will call something.__getitem__ with the slice
 instance, though this is neither a fast nor nicely spelled way to customize.
 

Yes, the slice object itself takes whatever you want.  However, Python 
special-cases what happens for X[a:b] *if* X has the sequence-protocol 
defined.   This is the code-path I'm trying to enhance.

-Travis



Re: [Python-Dev] PEP for adding an sq_index slot so that any object, a or b, can be used in X[a:b] notation

2006-02-09 Thread Travis E. Oliphant
Bengt Richter wrote:

 
 How about if SLICE byte code interpretation would try to call
 obj.__int__() if passed a non-(int,long) obj ? Would that cover your use case?


I believe that this is pretty much what I'm proposing (except I'm not 
proposing to use the __int__ method because it is already used as 
coercion and doing this would allow floats to be used in slices which is 
a bad thing).  The apply_slice and assign_slice functions in ceval.c are 
called for the SLICE and STORE_SLICE and DELETE_SLICE opcodes.

 BTW the slice type happily accepts anything for start:stop:step I believe,
 and something[slice(whatever)] will call something.__getitem__ with the slice
 instance, though this is neither a fast nor nicely spelled way to customize.
 

Yes, the slice object itself takes whatever you want.  However, Python 
special-cases what happens for X[a:b] *if* X has the sequence protocol 
defined.   This is the code path I'm trying to enhance.

-Travis



Re: [Python-Dev] PEP for adding an sq_index slot so that any object, a or b, can be used in X[a:b] notation

2006-02-09 Thread Travis E. Oliphant


Attached is an updated PEP for 357.  I think the index concept is better 
situated in the PyNumberMethods structure.  That way an object doesn't 
have to define the Sequence protocol just to be treated like an index.


-Travis
PEP: 357
Title:  Allowing any object to be used for slicing
Version:  Revision 1.2
Last Modified: 02/09/2006
Author: Travis Oliphant oliphant at ee.byu.edu
Status: Draft
Type:  Standards Track
Created:  09-Feb-2006
Python-Version:  2.5

Abstract

   This PEP proposes adding an nb_index slot in PyNumberMethods and
   an __index__ special method so that arbitrary objects can be used
   in slice syntax.

Rationale

   Currently integers and long integers play a special role in slice
   notation in that they are the only objects allowed in slice
   syntax. In other words, if X is an object implementing the sequence
   protocol, then X[obj1:obj2] is only valid if obj1 and obj2 are both
   integers or long integers.  There is no way for obj1 and obj2 to
   tell Python that they could be reasonably used as indexes into a
   sequence.  This is an unnecessary limitation.  

   In NumPy, for example, there are 8 different integer scalars
   corresponding to unsigned and signed integers of 8, 16, 32, and 64
   bits.  These type-objects could reasonably be used as indexes into
   a sequence if there were some way for their typeobjects to tell
   Python what integer value to use.  

Proposal
 
   Add a nb_index slot to PyNumberMethods, and a corresponding
   __index__ special method.  Objects could define a function to
   place in the nb_index slot that returns an appropriate
   C-integer for use as ilow or ihigh in PySequence_GetSlice, 
   PySequence_SetSlice, and PySequence_DelSlice.

Implementation Plan

   1) Add the slots

   2) Change the ISINT macro in ceval.c to ISINDEX and alter it to 
  accommodate objects with the index slot defined.

   3) Change the _PyEval_SliceIndex function to accommodate objects
  with the index slot defined.

Possible Concerns

   Speed: 

   Implementation should not slow down Python because integers and long
   integers used as indexes will complete in the same number of
   instructions.  The only change will be that what used to generate
   an error will now be acceptable.

   Why not use nb_int which is already there?

   The nb_int, nb_oct, and nb_hex methods are used for coercion.
   Floats have these methods defined and floats should not be used in
   slice notation.

Reference Implementation

   Available on PEP acceptance.

Copyright

   This document is placed in the public domain

 



Re: [Python-Dev] PEP for adding an sq_index slot so that any object, a or b, can be used in X[a:b] notation

2006-02-09 Thread Travis E. Oliphant
Guido van Rossum wrote:
 On 2/9/06, Travis Oliphant [EMAIL PROTECTED] wrote:
 
 
 BTW do you also still want to turn ZeroDivisionError into a warning
 (that is changed into an error by default)? That idea shared a slide
 and I believe it was discussed in the same meeting you  I and some
 others had in San Mateo last summer.

I think that idea has some support, but I haven't been thinking about it 
for a while.

 
 
 Shouldn't this slot be in the PyNumberMethods extension? It feels more
 like a property of numbers than of a property of sequences. Also, the
 slot name should then probably be nb_index.

Yes, definitely!!!

 
 There's also an ambiguity when using simple indexing. When writing
 x[i] where x is a sequence and i an object that isn't int or long but
 implements __index__, I think i.__index__() should be used rather than
 bailing out. I suspect that you didn't think of this because you've
 already special-cased this in your code -- when a non-integer is
 passed, the mapping API is used (mp_subscript). This is done to
 support extended slicing. The built-in sequences (list, str, unicode,
 tuple for sure, probably more) that implement mp_subscript should
 probe for nb_index before giving up. The generic code in
 PyObject_GetItem should also check for nb_index before giving up.
 

I agree.  These should also be changed. I'll change the PEP, too.
 
 I think all sequence objects that implement mp_subscript should
 probably be modified according to the lines I sketched above.


True.

 
 This is very close to acceptance. I think I'd like to see the patch
 developed and submitted to SF (and assigned to me) prior to
 acceptance.
 

O.K. I'll work on it.

-Travis



Re: [Python-Dev] PEP for adding an sq_index slot so that any object, a or b, can be used in X[a:b] notation

2006-02-09 Thread Travis E. Oliphant
Guido van Rossum wrote:
 On 2/9/06, Travis Oliphant [EMAIL PROTECTED] wrote:
 
 
 This is very close to acceptance. I think I'd like to see the patch
 developed and submitted to SF (and assigned to me) prior to
 acceptance.
 
 
Copyright

   This document is placed in the public domain
 
 
 If you agree with the above comments, please send me an updated
 version of the PEP and I'll check it in over the old one, and approve
 it. Then just use SF to submit the patch etc.
 

I uploaded a patch to SF against current SVN.  The altered code compiles 
and the functionality works with classes defined in Python.  I have yet 
to test against a C-type that defines the method.

The patch adds a new API function int PyObject_AsIndex(obj).

This was not specifically in the PEP but probably should be.  The name 
could also be PyNumber_AsIndex(obj)  but I was following the nb_nonzero 
slot example to help write the code.

-Travis
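At the Python level, the eventual counterpart of this C API is 
operator.index() (available since Python 2.5); a quick sketch of its behavior:

```python
import operator

class Idx:
    def __index__(self):
        return 7

# Objects with __index__ convert losslessly; ints pass through.
assert operator.index(Idx()) == 7
assert operator.index(42) == 42

# Floats define __int__ but not __index__, so they are rejected.
try:
    operator.index(3.5)
    raised = False
except TypeError:
    raised = True
assert raised
```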





Re: [Python-Dev] PEP for adding an sq_index slot so that any object, a or b, can be used in X[a:b] notation

2006-02-09 Thread Travis E. Oliphant
Thomas Wouters wrote:
 On Thu, Feb 09, 2006 at 02:32:47PM -0800, Brett Cannon wrote:
 
Looks good to me.  Only change I might make is mention why __int__
doesn't work sooner (such as in the rationale).  Otherwise +1 from me.
 
 
 I have a slight reservation about the name. On the one hand it's clear the
 canonical use will be for indexing sequences, and __index__ doesn't look
 enough like __int__ to get people confused on the difference. On the other
 hand, there are other places (in C) that want an actual int, and they could
 use __index__ too. Even more so if a PyArg_Parse* grew a format for 'the
 index-value for this object' ;)
 

There are other places in Python that check specifically for int objects 
and long integer objects and fail with anything else.  Perhaps all of 
these should also call the __index__ slot.

But then it *should* be renamed, e.g. to __true_int__.  One such place 
is the sequence_repeat function in abstract.c.

The patch I submitted, perhaps aggressively, changed that function to 
call the nb_index slot as well instead of raising an error.

Perhaps the slot should be called nb_true_int?

-Travis



 On the *other* one hand, I can't think of a good name... but on the other
 other hand, it would be awkward to have to support an old name just because
 the real use wasn't envisioned yet.
 
 One-time-machine-for-the-shortsighted-quadrumanus-please-ly y'r,s
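Sequence repetition is a concrete example of such a place: once 
sequence_repeat consults the new slot (Python 2.5 and later behavior), any 
index-like object works as a repeat count.  A sketch:

```python
class Count:
    def __index__(self):
        return 3

# Repeat counts no longer need to be real ints:
assert [0] * Count() == [0, 0, 0]
assert 'ab' * Count() == 'ababab'
```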



[Python-Dev] Help with Unicode arrays in NumPy

2006-02-07 Thread Travis E. Oliphant

This is a design question which is why I'm posting here.  Recently the 
NumPy developers have become more aware of the difference between UCS2 
and UCS4 builds of Python.  NumPy arrays can be of Unicode type.  In 
other words a NumPy array can be made of up fixed-data-length unicode 
strings.

Currently that means that they are unicode strings of basic size UCS2 
or UCS4 depending on the platform.  It is this duality that has some 
people concerned.  For all other data-types, NumPy allows the user to 
explicitly request a bit-width for the data-type.

So, we are thinking of introducing another data-type to NumPy to 
differentiate between UCS2 and UCS4 unicode strings.  (This also means 
a unicode scalar object, i.e. a string type for each of these, exactly 
one of which will inherit from the Python unicode type.)

Before embarking on this journey, however, we are seeking advice from 
individuals on this list who are wiser in the ways of Unicode.

Perhaps all we need to do is be more careful on input and output of 
Unicode data-types so that transfer of unicode can be handled correctly 
on each platform.

Any thoughts?

-Travis Oliphant
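For reference, a build's unicode width can be probed with sys.maxunicode; a 
sketch (note that narrow builds were later eliminated by PEP 393 in Python 
3.3, where maxunicode is always 0x10FFFF):

```python
import sys

if sys.maxunicode > 0xFFFF:
    build = 'UCS4'   # wide build: code points up to U+10FFFF
else:
    build = 'UCS2'   # narrow build: 16-bit units, surrogate pairs

assert build in ('UCS2', 'UCS4')
```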





Re: [Python-Dev] New PEP: Using ssize_t as the index type

2005-12-30 Thread Travis E. Oliphant
Martin v. Löwis wrote:
 Please let me know what you think.
 
 Regards,
 Martin
 
 PEP: XXX
 Title: Using ssize_t as the index type
 Version: $Revision$
 Last-Modified: $Date$
 Author: Martin v. Löwis [EMAIL PROTECTED]
 Status: Draft
 Type: Standards Track
 Content-Type: text/x-rst
 Created: 18-Dec-2005
 Post-History:
 
 
 Abstract
 
 
 In Python 2.4, indices of sequences are restricted to the C type
 int. On 64-bit machines, sequences therefore cannot use the full
 address space, and are restricted to 2**31 elements. This PEP proposes
 to change this, introducing a platform-specific index type
 Py_ssize_t. An implementation of the proposed change is in
 http://svn.python.org/projects/python/branches/ssize_t.

Sounds wonderful.   Would love to see this in Python 2.5.  This will fix 
important 64-bit issues.  Perhaps someone could explain to me what the 
difference between Py_ssize_t and Py_intptr_t would be? Is this not a 
satisfactory Py_ssize_t already?
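For context, Py_ssize_t is the signed counterpart of size_t (bounded by the 
maximum object size), while Py_intptr_t is an integer wide enough to hold a 
pointer; they coincide on mainstream platforms, but the C standard does not 
require it.  A quick check via ctypes (the equality below is a platform 
assumption that holds on common 32- and 64-bit systems):

```python
import ctypes

# On mainstream platforms both are 4 bytes (32-bit) or 8 bytes (64-bit),
# though the C standard does not guarantee ssize_t == intptr_t.
assert ctypes.sizeof(ctypes.c_ssize_t) in (4, 8)
assert ctypes.sizeof(ctypes.c_ssize_t) == ctypes.sizeof(ctypes.c_void_p)
```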


 
 
 Rationale
 =
 
 64-bit machines are becoming more popular, and the size of main memory
 increases beyond 4GiB. On such machines, Python currently is limited,
 in that sequences (strings, unicode objects, tuples, lists,
 array.arrays, ...)  cannot contain more than 2GiElements.
 
 Today, very few machines have memory to represent larger lists: as
 each pointer is 8B (in a 64-bit machine), one needs 16GiB to just hold
 the pointers of such a list; with data in the list, the memory
 consumption grows even more.  However, there are three container types
 for which users request improvements today:
 
 * strings (currently restricted to 2GiB)
 * mmap objects (likewise; plus the system typically
   won't keep the whole object in memory concurrently)
 * Numarray objects (from Numerical Python)

scipy_core objects are the replacement for both Numarray and Numerical 
Python and are 64-bit clean *except* for the sequence 
protocol and the buffer protocol.  Fixing this problem will clean up a 
lot of unnecessary code.

Looking forward to it...

-Travis





Re: [Python-Dev] Problems with the Python Memory Manager

2005-11-24 Thread Travis E. Oliphant
Armin Rigo wrote:
 Hi,
 
 Ok, here is the reason for the leak...
 
 There is in scipy a type called 'int32_arrtype' which inherits from both
 another scipy type called 'signedinteger_arrtype', and from 'int'.
 Obscure!  This is not 100% officially allowed: you are inheriting from
 two C types.  You're living dangerously!

This is allowed because the two types have compatible binary layouts (in 
fact the signed integer type contains only the PyObject_HEAD).

 
 Now in this case it mostly works as expected, because the parent scipy
 type has no field at all, so it's mostly like inheriting from both
 'object' and 'int' -- which is allowed, or would be if the bases were
 written in the opposite order.  But still, something confuses the
 fragile logic of typeobject.c.  (I'll leave this bit to scipy people to
 debug :-)
 

This is definitely possible.  I've tripped up in this logic before.   I 
was beginning to suspect that it might have something to do with what is 
going on.

 The net result is that unless you force your own tp_free as in revision
 1490, the type 'int32_arrtype' has tp_free set to int_free(), which is
 the normal tp_free of 'int' objects.  This causes all deallocated
 int32_arrtype instances to be added to the CPython free list of integers
 instead of being freed!

I'm not sure this is true.  It sounds plausible, but I will have to 
check.   Previously the tp_free should have been inherited as 
PyObject_Del for the int32_arrtype.  Unless the typeobject.c code copied 
the tp_free from the wrong base type, this shouldn't have been the case.

Thanks for the pointers.  It sounds like we're getting close.  Perhaps 
the problem is in typeobject.c.


-Travis




[Python-Dev] Problems with mro for dual inheritance in C [Was: Problems with the Python Memory Manager]

2005-11-24 Thread Travis E. Oliphant
Armin Rigo wrote:
 Hi,
 
 Ok, here is the reason for the leak...
 
 There is in scipy a type called 'int32_arrtype' which inherits from both
 another scipy type called 'signedinteger_arrtype', and from 'int'.
 Obscure!  This is not 100% officially allowed: you are inheriting from
 two C types.  You're living dangerously!
 
 Now in this case it mostly works as expected, because the parent scipy
 type has no field at all, so it's mostly like inheriting from both
 'object' and 'int' -- which is allowed, or would be if the bases were
 written in the opposite order.  But still, something confuses the
 fragile logic of typeobject.c.  (I'll leave this bit to scipy people to
 debug :-)

Well, I'm stumped on this.  Note the method resolution order for the new 
scalar array type (exactly as I would expect).   Why doesn't the int32 
type inherit its tp_free from the earlier types in the MRO first?

>>> a = zeros(10)
>>> type(a[0]).mro()
[<type 'int32_arrtype'>, <type 'signedinteger_arrtype'>, <type 'integer_arrtype'>,
 <type 'numeric_arrtype'>, <type 'generic_arrtype'>, <type 'int'>, <type 'object'>]
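That inheritance shape can be mimicked in pure Python; the MRO puts the mixin 
bases ahead of int, which is why one would expect slots to be inherited from 
them first.  The class names below are toy analogues of the scipy types, not 
the real ones:

```python
class Generic: pass
class Integer(Generic): pass

class Int32(Integer, int):
    """Toy analogue of scipy's int32_arrtype dual inheritance."""

# The mixin chain precedes int in the MRO, just as in the
# interpreter session above.
names = [cls.__name__ for cls in Int32.mro()]
assert names == ['Int32', 'Integer', 'Generic', 'int', 'object']
```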



