On 05/09/2012 09:08 PM, mark florisson wrote:
On 9 May 2012 19:56, Robert Bradshaw<rober...@gmail.com>  wrote:
On Tue, May 8, 2012 at 3:35 AM, mark florisson
<markflorisso...@gmail.com>  wrote:
On 8 May 2012 10:47, Dag Sverre Seljebotn<d.s.seljeb...@astro.uio.no>  wrote:

After some thinking I believe I can see more clearly where Mark is coming
from. To sum up, it's either

A) Keep both np.ndarray[double] and double[:] around, with clearly defined
and separate roles. np.ndarray[double] implementation is revamped to allow
fast slicing etc., based on the double[:] implementation.

B) Deprecate np.ndarray[double] sooner rather than later, but make double[:]
have functionality that is *really* close to what np.ndarray[double]
currently does. In most cases one should be able to basically replace
np.ndarray[double] with double[:] and the code should continue to work just
like before; the difference is that if you pass in anything other than a NumPy
array, it will likely fail with a runtime AttributeError at some point
rather than fail a PyType_Check.
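
The difference in failure mode described under B can be sketched in plain Python. This is only an analogy and not Cython's actual behaviour: array.array stands in for np.ndarray, the builtin memoryview stands in for double[:], and the function names total_typed and total_buffer are hypothetical.

```python
from array import array


def total_typed(arr):
    """Analogue of np.ndarray[double]: reject the wrong type up front."""
    if not isinstance(arr, array):  # stands in for the PyType_Check
        raise TypeError("expected array.array, got %s" % type(arr).__name__)
    return sum(arr), arr.typecode


def total_buffer(obj):
    """Analogue of double[:]: accept any buffer, use object methods later."""
    view = memoryview(obj)        # succeeds for any buffer exporter
    result = sum(view)            # buffer-level work succeeds too
    return result, obj.typecode   # fails late if obj has no .typecode


data = array("d", [1.0, 2.0, 3.0])
print(total_typed(data))
print(total_buffer(data))

# A bytearray exports a buffer but is not an array.array.
for func in (total_typed, total_buffer):
    try:
        func(bytearray(b"\x01\x02"))
    except (TypeError, AttributeError) as exc:
        print(func.__name__, "raised", type(exc).__name__)
```

The first function fails immediately with a TypeError, while the second gets through the buffer-level work and only fails with an AttributeError once an object-level attribute is needed.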

That's a good summary. I have a big preference for B here, but I agree
that treating a typed memoryview as both a user object (possibly
converted through callback) and a typed memoryview "subclass" is quite
magicky.

With the talk of overlay modules and go-style interface, being able to
specify the type of an object as well as its bufferness could become
more interesting than it even is now. The notion of supporting
multiple interfaces, e.g.

cdef np.ndarray & double[:] my_array

could obviate the need for np.ndarray[double]. Until we support
something like this, or decide to reject it, I think we need to keep
the old-style syntax around. (np.ndarray[double] could even become
this intersection type to gain all the new features before we decide
on an appropriate syntax).

It's kind of interesting but also kind of a pain to declare everywhere
like that. Buffer syntax should by no means be deprecated in the near
future, but at some point it will be better to have one way to do
things, whether slightly magicky or more convoluted or not. Also, as
Dag mentioned, if we want fused extension types it makes more sense to
remove buffer syntax to disambiguate this and avoid context-dependent
special casing (e.g. np.ndarray and array.array).

I don't think it hurts to have two ways of doing things if they are sufficiently well-motivated, sufficiently well-defined, and sufficiently different from one another.

The original reason I wanted double[:] was to stop tying ourselves to NumPy and to avoid promising compatibility, because of the polymorphic aspect of NumPy. I think in the future, the Python behaviour of, say, +, in np.ndarray is going to be different from what we have today. You'll have the + fetching data over the network in some cases, or treating NA in special ways (I think there might be over a thousand emails about NA on the NumPy list by now?). In short, lots of stuff can be going on that we can't emulate in Cython.

OTOH, perhaps that doesn't matter -- we just raise an exception for the NumPy arrays that we can't deal with, and move on...

I wouldn't particularly mind something concise like 'm.obj'.
The AttributeError would be the case as usual, when a python object
doesn't have the right interface.
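
As a rough illustration of the explicit 'm.obj' style, Python's builtin memoryview already exposes the underlying object through an obj attribute (Python 3.3 and later). The sketch below uses that builtin as a stand-in; whether a Cython typed memoryview would behave the same way is an assumption here, and the helper name as_list is hypothetical.

```python
from array import array


def as_list(m):
    """Call an object-level method explicitly through m.obj."""
    # The view itself only offers buffer-level access; anything that
    # belongs to the original object's interface goes through .obj.
    return m.obj.tolist()


data = array("d", [1.0, 2.0, 3.0])
m = memoryview(data)

print(m[0])          # buffer-level element access
print(m.obj is data) # .obj is the object the view was created from
print(as_list(m))    # array.array has tolist(), so this works

try:
    as_list(memoryview(b"ab"))  # bytes has no tolist() method
except AttributeError as exc:
    print("AttributeError:", exc)
```

The AttributeError arises in the usual way: the buffer was fine, but the underlying object does not have the interface the code asked for.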

Having to insert the .obj in there does make it more painful to
convert existing Python code.

Yes, hence my slight bias towards magicky. But I do fully agree with
all opposing arguments that say "too much magic". I just prefer to be
pragmatic here :)

It's a very big decision. I think two or three alternatives are starting to crystallise; but to choose between them I think it calls for a CEP with code examples, and a request for comment on both cython-users and numpy-discussion.

Until that happens, avoiding any magic seems like a conservative forward-compatible default.

Dag
_______________________________________________
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel
