On 8 May 2012 09:49, Stefan Behnel <stefan...@behnel.de> wrote:
> Dag Sverre Seljebotn, 08.05.2012 10:36:
>> On 05/08/2012 10:18 AM, Stefan Behnel wrote:
>>> Dag Sverre Seljebotn, 08.05.2012 09:57:
>>>> On 05/07/2012 11:21 PM, mark florisson wrote:
>>>>> On 7 May 2012 19:40, Dag Sverre Seljebotn wrote:
>>>>>> mark florisson wrote:
>>>>>>> On 7 May 2012 17:00, Dag Sverre Seljebotn wrote:
>>>>>>>> On 05/07/2012 04:16 PM, Stefan Behnel wrote:
>>>>>>>>> Stefan Behnel, 07.05.2012 15:04:
>>>>>>>>>> Dag Sverre Seljebotn, 07.05.2012 13:48:
>>>>>>>>>>> BTW, with the coming of memoryviews, me and Mark talked about just
>>>>>>>>>>> deprecating the "mytype[...]" meaning buffers, and rather treat it
>>>>>>>>>>> as np.ndarray, array.array etc. being some sort of "template types".
>>>>>>>>>>> That is, we disallow "object[int]" and require some special
>>>>>>>>>>> declarations in the relevant pxd files.
>>>>>>>>>>
>>>>>>>>>> Hmm, yes, it's unfortunate that we have two different types of
>>>>>>>>>> syntax now, one that declares the item type before the brackets and
>>>>>>>>>> one that declares it afterwards.
>>>>>>>>> Should we consider the buffer interface syntax deprecated and focus
>>>>>>>>> on the memory view syntax?
>>>>>>>>
>>>>>>>> I think that's the very-long-term intention. Then again, it may be
>>>>>>>> too early to really tell yet, we just need to see how the memory
>>>>>>>> views play out in real life and whether they'll be able to replace
>>>>>>>> np.ndarray[double] among real users. We don't want to shove things
>>>>>>>> down users' throats.
>>>>>>>>
>>>>>>>> But the use of the trailing-[] syntax needs some cleaning up.
>>>>>>>> Me and Mark agreed we'd put this proposal forward when we got around
>>>>>>>> to it:
>>>>>>>>
>>>>>>>> - Deprecate the "object[double]" form, where [dtype] can be stuck on
>>>>>>>>   any extension type
>>>>>>>>
>>>>>>>> - But, do NOT (for the next year at least) deprecate
>>>>>>>>   np.ndarray[double], array.array[double], etc. Basically, there
>>>>>>>>   should be a magic flag in extension type declarations saying
>>>>>>>>   "I can be a buffer".
>>>>>>>>
>>>>>>>> For one thing, that is sort of needed to open up things for templated
>>>>>>>> cdef classes/fused types cdef classes, if that is ever implemented.
>>>>>>>
>>>>>>> Deprecating is definitely a good start. I think at least if you only
>>>>>>> allow two types as buffers it will be at least reasonably clear when
>>>>>>> one is dealing with fused types or buffers.
>>>>>>>
>>>>>>> Basically, I think memoryviews should live up to demands of the users,
>>>>>>> which would mean there would be no reason to keep the buffer syntax.
>>>>>>
>>>>>> But they are different approaches -- use a different type/API, or just
>>>>>> try to speed up parts of NumPy..
>>>>>>
>>>>>>> One thing to do is make memoryviews coerce cheaply back to the
>>>>>>> original objects if wanted (which is likely). Writing
>>>>>>> np.asarray(mymemview) is kind of annoying.
>>>>>>
>>>>>> It is going to be very confusing to have type(mymemview),
>>>>>> repr(mymemview), and so on come out as NumPy arrays, but not have the
>>>>>> full API of NumPy. Unless you auto-convert on getattr to...
>>>>>
>>>>> Yeah, the idea is very simple, as you mention, just keep the object
>>>>> around cached, and when you slice construct one lazily.
>>>>>
>>>>>> If you want to eradicate the distinction between the backing array and
>>>>>> the memory view and make it transparent, I really suggest you kick back
>>>>>> alive np.ndarray (it can exist in some 'unrealized' state with delayed
>>>>>> construction after slicing, and so on).
>>>>>> Implementation much the same either way, it is all about how it is
>>>>>> presented to the user.
>>>>>
>>>>> You mean the buffer syntax?
>>>>>
>>>>>> Something like mymemview.asobject() could work though, and while not
>>>>>> much shorter, it would have some polymorphism that np.asarray does not
>>>>>> have (based probably on some custom PEP 3118 extension)
>>>>>
>>>>> I was thinking you could allow the user to register a callback, and
>>>>> use that to coerce from a memoryview back to an object (given a
>>>>> memoryview object). For numpy this would be np.asarray, and the
>>>>> implementation is allowed to cache the result (which it will).
>>>>> It may be too magicky though... but it will be convenient. The
>>>>> memoryview will act as a subclass, meaning that any of its methods
>>>>> will override methods of the converted object.
>>>>
>>>> My point was that this seems *way* too magicky.
>>>>
>>>> Beyond "confusing users" and so on that are sort of subjective, here's a
>>>> fundamental problem for you: We're making it very difficult to type-infer
>>>> memoryviews. Consider:
>>>>
>>>>     cdef double[:] x = ...
>>>>     y = x
>>>>     print y.shape
>>>>
>>>> Now, because y is not typed, you're semantically throwing in a conversion
>>>> on line 2, so that line 3 says that you want the attribute access to be
>>>> invoked on "whatever object x coerced back to". And we have no idea what
>>>> kind of object that is.
>>>>
>>>> If you don't transparently convert to object, it'd be safe to automatically
>>>> infer y as a double[:].
>>>
>>> Why can't y be inferred as the type of x due to the assignment?
>>>
>>>
>>>> On a related note, I've said before that I dislike the notion of
>>>>
>>>>     cdef double[:] mview = obj
>>>>
>>>> I'd rather like
>>>>
>>>>     cdef double[:] mview = double[:](obj)
>>>
>>> Why?
>>> We currently allow
>>>
>>>     cdef char* s = some_py_bytes_string
>>>
>>> Auto-coercion is a serious part of the language, and I don't see the
>>> advantage of requiring the redundancy in the case above. It's clear enough
>>> to me what the typed assignment is intended to mean: get me a buffer view
>>> on the object, regardless of what it is.
>>>
>>>
>>>> I support Robert in that "np.ndarray[double]" is the syntax to use when you
>>>> want this kind of transparent "be an object when I need to and a memory
>>>> view when I need to".
>>>>
>>>> Proposal:
>>>>
>>>> 1) We NEVER deprecate "np.ndarray[double]", we commit to keeping that in
>>>> the language. It means exactly what you would like double[:] to mean, i.e.
>>>> a variable that is memoryview when you need to and an object otherwise.
>>>> When you use this type, you bear the consequences of early-binding things
>>>> that could in theory be overridden.
>>>>
>>>> 2) double[:] is for when you want to access data of *any* Python object in
>>>> a generic way. Raw PEP 3118. In those situations, access to the underlying
>>>> object is much less useful.
>>>>
>>>> 2a) Therefore we require that you do "mview.asobject()" manually; doing
>>>> "mview.foo()" is a compile-time error
>>>
>>> Sounds good. I think that would clean up the current syntax overlap very
>>> nicely.
>>>
>>>
>>>> 2b) To drive the point home among users, and aid type inference and
>>>> overall language clarity, we REMOVE the auto-acquisition and require that
>>>> you do
>>>>
>>>>     cdef double[:] mview = double[:](obj)
>>>
>>> I don't see the point, as noted above. Either "obj" is statically typed and
>>> the bare assignment becomes a no-op, or it's not typed and the assignment
>>> coerces by creating a view. As with all other typed assignments.
>>>
>>>
>>>> 2c) Perhaps: Do not even coerce to a Python memoryview and disallow
>>>> "print mview"; instead require that you do "print mview.asmemoryview()" or
>>>> "print memoryview(mview)" or somesuch.
>>>
>>> This seems to depend on 2b.
>>
>> This I don't understand. The question of 2c) is the analogue to
>> auto-coercion of "char*" to bytes; approving 2c) would put memoryviews in
>> line with char*.
>>
>> Then again, we could in future auto-coerce char* to a ctypes pointer, and
>> in that case, coercing a memoryview to an object representing that
>> memoryview would be OK.
>>
>> Either way, you would never get back the same object that you coerced from!
>
> Ah, that's what you meant. I thought you were referring to getting a
> memoryview from an object.
>
> I agree that a buffer view shouldn't auto-coerce back to its owner (or to a
> Python object in general), that's the whole point of the syntax cleanup.
>
> In simple cases, buffer.obj would be the thing to talk to, except for
> memory views, where only the view knows the mapped memory layout but the
> underlying exporter has the methods to deal with the buffer. In that case,
> we may really want to leave it to the user to handle this. I don't think
> the compiler can do the right thing in all cases, and the user is really
> the only one who knows what kind of object should be used or even
> instantiated to wrap a buffer. Nothing we can do is shorter or more clearly
> readable than np.asarray() or whatever function a specific library has for
> this.
>
> So, what about just keeping buffer.obj visible and leaving everything else
> to users?

buffer.base gets you the original object.

> Stefan
> _______________________________________________
> cython-devel mailing list
> cython-devel@python.org
> http://mail.python.org/mailman/listinfo/cython-devel
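
For anyone following along, here is a minimal sketch of the point about a view and its exporter. It uses Python's built-in memoryview and array.array as stand-ins, which is an assumption: Cython typed memoryviews are not shown, and their attribute naming may differ. The view remembers its exporter through .obj, but a sliced view's offset and shape live only in the view, so rewrapping the data in a specific container is left to the user:

```python
from array import array

# Exporter: any object implementing the PEP 3118 buffer protocol.
backing = array("d", [0.0, 1.0, 2.0, 3.0])

view = memoryview(backing)  # view over the whole buffer
tail = view[2:]             # slicing yields another view, not a new array

# The sliced view still points at the original exporter...
owner = tail.obj

# ...while the slice's layout is known only to the view itself.
print(owner is backing)          # the exporter, not a two-element slice
print(len(owner), tail.shape)    # lengths differ: exporter vs. view

# Getting a container for just the viewed data is an explicit user step.
# array() is used here; for NumPy it would be np.asarray(), as in the thread.
rewrapped = array(tail.format, tail.tolist())
print(rewrapped)
```

Here tail.obj hands back the whole four-element array rather than the two-element slice, which is why the choice of wrapper stays with the user.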