Dag Sverre Seljebotn, 08.05.2012 09:57:
> On 05/07/2012 11:21 PM, mark florisson wrote:
>> On 7 May 2012 19:40, Dag Sverre Seljebotn wrote:
>>> mark florisson wrote:
>>>> On 7 May 2012 17:00, Dag Sverre Seljebotn wrote:
>>>>> On 05/07/2012 04:16 PM, Stefan Behnel wrote:
>>>>>> Stefan Behnel, 07.05.2012 15:04:
>>>>>>> Dag Sverre Seljebotn, 07.05.2012 13:48:
>>>>>>>> BTW, with the coming of memoryviews, me and Mark talked about just
>>>>>>>> deprecating the "mytype[...]" meaning buffers, and rather treat it
>>>>>>>> as np.ndarray, array.array etc. being some sort of "template types".
>>>>>>>> That is, we disallow "object[int]" and require some special
>>>>>>>> declarations in the relevant pxd files.
>>>>>>>
>>>>>>> Hmm, yes, it's unfortunate that we have two different types of
>>>>>>> syntax now, one that declares the item type before the brackets
>>>>>>> and one that declares it afterwards.
>>>>>> Should we consider the
>>>>>> buffer interface syntax deprecated and focus on the memory view
>>>>>> syntax?
>>>>>
>>>>> I think that's the very-long-term intention. Then again, it may be
>>>>> too early to really tell yet, we just need to see how the memory
>>>>> views play out in real life and whether they'll be able to replace
>>>>> np.ndarray[double] among real users. We don't want to shove things
>>>>> down users' throats.
>>>>>
>>>>> But the use of the trailing-[] syntax needs some cleaning up. Me and
>>>>> Mark agreed we'd put this proposal forward when we got around to it:
>>>>>
>>>>> - Deprecate the "object[double]" form, where [dtype] can be stuck on
>>>>> any extension type
>>>>>
>>>>> - But, do NOT (for the next year at least) deprecate
>>>>> np.ndarray[double], array.array[double], etc. Basically, there
>>>>> should be a magic flag in extension type declarations saying
>>>>> "I can be a buffer".
>>>>>
>>>>> For one thing, that is sort of needed to open up things for templated
>>>>> cdef classes/fused types cdef classes, if that is ever implemented.
>>>>
>>>> Deprecating is definitely a good start. I think at least if you only
>>>> allow two types as buffers it will be at least reasonably clear when
>>>> one is dealing with fused types or buffers.
>>>>
>>>> Basically, I think memoryviews should live up to demands of the users,
>>>> which would mean there would be no reason to keep the buffer syntax.
>>>
>>> But they are different approaches -- use a different type/API, or just
>>> try to speed up parts of NumPy..
>>>
>>>> One thing to do is make memoryviews coerce cheaply back to the
>>>> original objects if wanted (which is likely). Writing
>>>> np.asarray(mymemview) is kind of annoying.
>>>
>>> It is going to be very confusing to have type(mymemview),
>>> repr(mymemview), and so on come out as NumPy arrays, but not have the
>>> full API of NumPy. Unless you auto-convert on getattr to...
>>
>> Yeah, the idea is very simple, as you mention, just keep the object
>> around cached, and when you slice construct one lazily.
>>
>>> If you want to eradicate the distinction between the backing array and
>>> the memory view and make it transparent, I really suggest you kick back
>>> alive np.ndarray (it can exist in some 'unrealized' state with delayed
>>> construction after slicing, and so on). Implementation much the same
>>> either way, it is all about how it is presented to the user.
>>
>> You mean the buffer syntax?
>>
>>> Something like mymemview.asobject() could work though, and while not
>>> much shorter, it would have some polymorphism that np.asarray does not
>>> have (based probably on some custom PEP 3118 extension)
>>
>> I was thinking you could allow the user to register a callback, and
>> use that to coerce from a memoryview back to an object (given a
>> memoryview object).
>> For numpy this would be np.asarray, and the
>> implementation is allowed to cache the result (which it will).
>> It may be too magicky though... but it will be convenient. The
>> memoryview will act as a subclass, meaning that any of its methods
>> will override methods of the converted object.
>
> My point was that this seems *way* too magicky.
>
> Beyond "confusing users" and so on that are sort of subjective, here's a
> fundamental problem for you: We're making it very difficult to type-infer
> memoryviews. Consider:
>
>     cdef double[:] x = ...
>     y = x
>     print y.shape
>
> Now, because y is not typed, you're semantically throwing in a conversion
> on line 2, so that line 3 says that you want the attribute access to be
> invoked on "whatever object x coerced back to". And we have no idea what
> kind of object that is.
>
> If you don't transparently convert to object, it'd be safe to automatically
> infer y as a double[:].

Why can't y be inferred as the type of x due to the assignment?

> On a related note, I've said before that I dislike the notion of
>
>     cdef double[:] mview = obj
>
> I'd rather like
>
>     cdef double[:] mview = double[:](obj)

Why? We currently allow

    cdef char* s = some_py_bytes_string

Auto-coercion is a serious part of the language, and I don't see the
advantage of requiring the redundancy in the case above. It's clear enough
to me what the typed assignment is intended to mean: get me a buffer view
on the object, regardless of what it is.

> I support Robert in that "np.ndarray[double]" is the syntax to use when you
> want this kind of transparent "be an object when I need to and a memory
> view when I need to".
>
> Proposal:
>
> 1) We NEVER deprecate "np.ndarray[double]", we commit to keeping that in
> the language. It means exactly what you would like double[:] to mean, i.e.
> a variable that is memoryview when you need to and an object otherwise.
> When you use this type, you bear the consequences of early-binding things
> that could in theory be overridden.
>
> 2) double[:] is for when you want to access data of *any* Python object in
> a generic way. Raw PEP 3118. In those situations, access to the underlying
> object is much less useful.
>
> 2a) Therefore we require that you do "mview.asobject()" manually; doing
> "mview.foo()" is a compile-time error

Sounds good. I think that would clean up the current syntax overlap very
nicely.

> 2b) To drive the point home among users, and aid type inference and
> overall language clarity, we REMOVE the auto-acquisition and require that
> you do
>
>     cdef double[:] mview = double[:](obj)

I don't see the point, as noted above. Either "obj" is statically typed and
the bare assignment becomes a no-op, or it's not typed and the assignment
coerces by creating a view. As with all other typed assignments.
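
A rough plain-Python analogy of the typed assignment discussed above, offered
as a sketch and not as what Cython actually generates. The helper name
acquire_view and its fmt parameter are made up for illustration, and the
built-in memoryview stands in for double[:]. It shows the two cases: an
object that is already a view passes through unchanged, and anything else is
coerced by requesting a PEP 3118 buffer from it.

```python
import array


def acquire_view(obj, fmt="d"):
    """Hypothetical stand-in for `cdef double[:] mview = obj`.

    Assumption of this sketch: an element-format mismatch is reported
    as ValueError.
    """
    # Already a view: the "assignment" is a no-op and returns the same object.
    # Otherwise ask the object for a buffer; memoryview() raises TypeError
    # if the object does not support the buffer protocol.
    view = obj if isinstance(obj, memoryview) else memoryview(obj)
    if view.format != fmt:
        raise ValueError(
            "expected item format %r, got %r" % (fmt, view.format)
        )
    return view


data = array.array("d", [1.0, 2.0, 3.0])

view = acquire_view(data)   # untyped object: coerced by creating a view
same = acquire_view(view)   # already a view: passed through unchanged

view[0] = 10.0              # the view shares memory with the exporter
print(same is view, data[0], view.shape)
```

The view knows nothing about the exporting object beyond its buffer, which
is the "raw PEP 3118" reading of double[:] in point 2 of the proposal.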

> 2c) Perhaps: Do not even coerce to a Python memoryview and disallow
> "print mview"; instead require that you do "print mview.asmemoryview()" or
> "print memoryview(mview)" or somesuch.

This seems to depend on 2b.

> (A related proposal that's been up earlier has been that a variable can be
> annotated with many interfaces; e.g.
>
>     cdef A|B|C obj
>
> ...and then when you do "obj.method", it is first looked up in C, then B,
> then A, then Python getattr. Not sure if we want to reopen that can of
> worms...)

Different topic - new thread?

Stefan
_______________________________________________
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel