Re: [Cython] buffer syntax vs. memory view syntax

Dag Sverre Seljebotn Tue, 08 May 2012 00:57:57 -0700

On 05/07/2012 11:21 PM, mark florisson wrote:

On 7 May 2012 19:40, Dag Sverre Seljebotn<[email protected]>  wrote:



mark florisson<[email protected]>  wrote:

On 7 May 2012 17:00, Dag Sverre Seljebotn<[email protected]> wrote:

On 05/07/2012 04:16 PM, Stefan Behnel wrote:


Stefan Behnel, 07.05.2012 15:04:


Dag Sverre Seljebotn, 07.05.2012 13:48:


BTW, with the coming of memoryviews, me and Mark talked about just
deprecating the "mytype[...]" meaning buffers, and rather treat it as
np.ndarray, array.array etc. being some sort of "template types". That
is, we disallow "object[int]" and require some special declarations in
the relevant pxd files.



Hmm, yes, it's unfortunate that we have two different types of syntax
now, one that declares the item type before the brackets and one that
declares it afterwards.



I actually think this merits some more discussion. Should we consider
the buffer interface syntax deprecated and focus on the memory view
syntax?



I think that's the very-long-term intention. Then again, it may be too
early to really tell yet, we just need to see how the memory views play
out in real life and whether they'll be able to replace
np.ndarray[double] among real users. We don't want to shove things down
users' throats.

But the use of the trailing-[] syntax needs some cleaning up. Me and
Mark agreed we'd put this proposal forward when we got around to it:

  - Deprecate the "object[double]" form, where [dtype] can be stuck on
    any extension type

  - But, do NOT (for the next year at least) deprecate
    np.ndarray[double], array.array[double], etc. Basically, there
    should be a magic flag in extension type declarations saying
    "I can be a buffer".

For one thing, that is sort of needed to open up things for templated
cdef classes/fused types cdef classes, if that is ever implemented.


Deprecating is definitely a good start. If you only allow two types as
buffers, it will at least be reasonably clear when one is dealing with
fused types or buffers.

Basically, I think memoryviews should live up to demands of the users,
which would mean there would be no reason to keep the buffer syntax.


But they are different approaches -- use a different type/API, or just try to
speed up parts of NumPy.

One thing to do is make memoryviews coerce cheaply back to the
original objects if wanted (which is likely). Writing
np.asarray(mymemview) is kind of annoying.



It is going to be very confusing to have type(mymemview), repr(mymemview), and 
so on come out as NumPy arrays, but not have the full API of NumPy. Unless you 
auto-convert on getattr to...


Yeah, the idea is very simple, as you mention: just keep the object
around cached, and when you slice, construct one lazily.

If you want to eradicate the distinction between the backing array and the 
memory view and make it transparent, I really suggest you kick back alive 
np.ndarray (it can exist in some 'unrealized' state with delayed construction 
after slicing, and so on). Implementation much the same either way, it is all 
about how it is presented to the user.


You mean the buffer syntax?

Something like mymemview.asobject() could work though, and while not much 
shorter, it would have some polymorphism that np.asarray does not have (based 
probably on some custom PEP 3118 extension)


I was thinking you could allow the user to register a callback, and
use that to coerce from a memoryview back to an object (given a
memoryview object). For numpy this would be np.asarray, and the
implementation is allowed to cache the result (which it will).
It may be too magicky though... but it will be convenient. The
memoryview will act as a subclass, meaning that any of its methods
will override methods of the converted object.
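
To make the callback idea above concrete, here is a rough plain-Python
sketch (not Cython, and not an existing Cython API). The names
register_coercer, CachedView and asobject are hypothetical and chosen
only for illustration. The standard-library array type stands in for
NumPy, where the registered callback would be np.asarray.

```python
from array import array

# Hypothetical registry: exporter type -> callback that turns a
# memoryview back into a full object.
_coercers = {}


def register_coercer(exporter_type, callback):
    """Register the callback used to coerce views of this exporter type."""
    _coercers[exporter_type] = callback


class CachedView:
    """Hypothetical wrapper: a memoryview plus a lazily built, cached object."""

    def __init__(self, obj):
        self.view = memoryview(obj)
        self._exporter_type = type(obj)
        self._cached = None

    def asobject(self):
        # Build the converted object on first use, then reuse it.
        if self._cached is None:
            callback = _coercers[self._exporter_type]
            self._cached = callback(self.view)
        return self._cached

    def __getattr__(self, name):
        # Only reached when normal lookup fails. The view's attributes
        # win; anything else is forwarded to the converted object.
        if hasattr(self.view, name):
            return getattr(self.view, name)
        return getattr(self.asobject(), name)


# For this sketch the callback copies the data into a new array.
register_coercer(array, lambda view: array(view.format, view.tolist()))

cv = CachedView(array("d", [1.0, 2.0, 3.0]))
first = cv.asobject()
second = cv.asobject()      # served from the cache
view_shape = cv.shape       # answered by the memoryview
typecode = cv.typecode      # not on memoryview, forwarded to the array
```

Note that the caller of cv.typecode cannot tell from the code alone
which object answered the lookup, which is the kind of implicit
behaviour being debated here.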


My point was that this seems *way* too magicky.

Beyond "confusing users" and so on that are sort of subjective, here's a
fundamental problem for you: We're making it very difficult to
type-infer memoryviews. Consider:


cdef double[:] x = ...
y = x
print y.shape

Now, because y is not typed, you're semantically throwing in a
conversion on line 2, so that line 3 says that you want the attribute
access to be invoked on "whatever object x coerced back to". And we have
no idea what kind of object that is.

If you don't transparently convert to object, it'd be safe to
automatically infer y as a double[:].


On a related note, I've said before that I dislike the notion of

cdef double[:] mview = obj

I'd rather like

cdef double[:] mview = double[:](obj)

I support Robert in that "np.ndarray[double]" is the syntax to use when
you want this kind of transparent "be an object when I need to and a
memory view when I need to".


Proposal:

1) We NEVER deprecate "np.ndarray[double]", we commit to keeping that
in the language. It means exactly what you would like double[:] to mean,
i.e. a variable that is memoryview when you need to and an object
otherwise. When you use this type, you bear the consequences of
early-binding things that could in theory be overridden.

2) double[:] is for when you want to access data of *any* Python
object in a generic way. Raw PEP 3118. In those situations, access to
the underlying object is much less useful.

2a) Therefore we require that you do "mview.asobject()" manually;
doing "mview.foo()" is a compile-time error

2b) To drive the point home among users, and aid type inference and
overall language clarity, we REMOVE the auto-acquisition and require
that you do


    cdef double[:] mview = double[:](obj)

2c) Perhaps: Do not even coerce to a Python memoryview and disallow
"print mview"; instead require that you do "print mview.asmemoryview()"
or "print memoryview(mview)" or somesuch.
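
For comparison, plain Python's built-in memoryview already behaves
roughly along the lines of 2a and 2b: acquisition is an explicit call,
the view exposes only buffer-level attributes, and getting back to the
exporting object is an explicit step. The sketch below is only a Python
analogy, not the proposed Cython syntax; array.array stands in for any
PEP 3118 exporter.

```python
from array import array

obj = array("d", [1.0, 2.0, 3.0])

# Explicit acquisition, loosely mirroring the proposed double[:](obj).
mview = memoryview(obj)

# The view carries buffer metadata but none of the exporter's methods.
shape = mview.shape
fmt = mview.format
has_append = hasattr(mview, "append")

# Explicit step back to the object, loosely analogous to mview.asobject().
backing = mview.obj

# Writes through the view land in the original object's memory.
mview[0] = 10.0
```

Because nothing is converted implicitly, a reader (or a compiler) can
tell from the code alone whether a name refers to the view or to the
object behind it.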

(A related proposal that's been up earlier has been that a variable can
be annotated with many interfaces; e.g.


cdef A|B|C obj

...and then when you do "obj.method", it is first looked up in C, then
B, then A, then Python getattr. Not sure if we want to reopen that can
of worms...)


Dag
_______________________________________________
cython-devel mailing list
[email protected]
http://mail.python.org/mailman/listinfo/cython-devel
