On 8 May 2012 10:22, mark florisson <markflorisso...@gmail.com> wrote:
> On 8 May 2012 09:36, Dag Sverre Seljebotn <d.s.seljeb...@astro.uio.no> wrote:
>> On 05/08/2012 10:18 AM, Stefan Behnel wrote:
>>>
>>> Dag Sverre Seljebotn, 08.05.2012 09:57:
>>>>
>>>> On 05/07/2012 11:21 PM, mark florisson wrote:
>>>>>
>>>>> On 7 May 2012 19:40, Dag Sverre Seljebotn wrote:
>>>>>>
>>>>>> mark florisson wrote:
>>>>>>>
>>>>>>> On 7 May 2012 17:00, Dag Sverre Seljebotn wrote:
>>>>>>>>
>>>>>>>> On 05/07/2012 04:16 PM, Stefan Behnel wrote:
>>>>>>>>>
>>>>>>>>> Stefan Behnel, 07.05.2012 15:04:
>>>>>>>>>>
>>>>>>>>>> Dag Sverre Seljebotn, 07.05.2012 13:48:
>>>>>>>>>>>
>>>>>>>>>>> BTW, with the coming of memoryviews, me and Mark talked about just
>>>>>>>>>>> deprecating the "mytype[...]" meaning buffers, and rather treat it
>>>>>>>>>>> as np.ndarray, array.array etc. being some sort of "template
>>>>>>>>>>> types". That is, we disallow "object[int]" and require some
>>>>>>>>>>> special declarations in the relevant pxd files.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Hmm, yes, it's unfortunate that we have two different types of
>>>>>>>>>> syntax now, one that declares the item type before the brackets
>>>>>>>>>> and one that declares it afterwards.
>>>>>>>>>
>>>>>>>>> Should we consider the buffer interface syntax deprecated and focus
>>>>>>>>> on the memory view syntax?
>>>>>>>>
>>>>>>>>
>>>>>>>> I think that's the very-long-term intention. Then again, it may be
>>>>>>>> too early to really tell yet, we just need to see how the memory
>>>>>>>> views play out in real life and whether they'll be able to replace
>>>>>>>> np.ndarray[double] among real users. We don't want to shove things
>>>>>>>> down users' throats.
>>>>>>>>
>>>>>>>> But the use of the trailing-[] syntax needs some cleaning up.
>>>>>>>> Me and Mark agreed we'd put this proposal forward when we got around
>>>>>>>> to it:
>>>>>>>>
>>>>>>>>  - Deprecate the "object[double]" form, where [dtype] can be stuck on
>>>>>>>>    any extension type
>>>>>>>>
>>>>>>>>  - But, do NOT (for the next year at least) deprecate
>>>>>>>>    np.ndarray[double], array.array[double], etc. Basically, there
>>>>>>>>    should be a magic flag in extension type declarations saying
>>>>>>>>    "I can be a buffer".
>>>>>>>>
>>>>>>>> For one thing, that is sort of needed to open up things for templated
>>>>>>>> cdef classes/fused types cdef classes, if that is ever implemented.
>>>>>>>
>>>>>>>
>>>>>>> Deprecating is definitely a good start. I think at least if you only
>>>>>>> allow two types as buffers it will be at least reasonably clear when
>>>>>>> one is dealing with fused types or buffers.
>>>>>>>
>>>>>>> Basically, I think memoryviews should live up to demands of the users,
>>>>>>> which would mean there would be no reason to keep the buffer syntax.
>>>>>>
>>>>>>
>>>>>> But they are different approaches -- use a different type/API, or just
>>>>>> try to speed up parts of NumPy..
>>>>>>
>>>>>>> One thing to do is make memoryviews coerce cheaply back to the
>>>>>>> original objects if wanted (which is likely). Writing
>>>>>>> np.asarray(mymemview) is kind of annoying.
>>>>>>
>>>>>>
>>>>>> It is going to be very confusing to have type(mymemview),
>>>>>> repr(mymemview), and so on come out as NumPy arrays, but not have the
>>>>>> full API of NumPy. Unless you auto-convert on getattr to...
>>>>>
>>>>>
>>>>> Yeah, the idea is very simple, as you mention, just keep the object
>>>>> around cached, and when you slice construct one lazily.
>>>>>
>>>>>> If you want to eradicate the distinction between the backing array and
>>>>>> the memory view and make it transparent, I really suggest you kick back
>>>>>> alive np.ndarray (it can exist in some 'unrealized' state with delayed
>>>>>> construction after slicing, and so on).
>>>>>> Implementation much the same either way, it is all about how it is
>>>>>> presented to the user.
>>>>>
>>>>>
>>>>> You mean the buffer syntax?
>>>>>
>>>>>> Something like mymemview.asobject() could work though, and while not
>>>>>> much shorter, it would have some polymorphism that np.asarray does not
>>>>>> have (based probably on some custom PEP 3118 extension)
>>>>>
>>>>>
>>>>> I was thinking you could allow the user to register a callback, and
>>>>> use that to coerce from a memoryview back to an object (given a
>>>>> memoryview object). For numpy this would be np.asarray, and the
>>>>> implementation is allowed to cache the result (which it will).
>>>>> It may be too magicky though... but it will be convenient. The
>>>>> memoryview will act as a subclass, meaning that any of its methods
>>>>> will override methods of the converted object.
>>>>
>>>>
>>>> My point was that this seems *way* too magicky.
>>>>
>>>> Beyond "confusing users" and so on that are sort of subjective, here's a
>>>> fundamental problem for you: We're making it very difficult to type-infer
>>>> memoryviews. Consider:
>>>>
>>>>     cdef double[:] x = ...
>>>>     y = x
>>>>     print y.shape
>>>>
>>>> Now, because y is not typed, you're semantically throwing in a conversion
>>>> on line 2, so that line 3 says that you want the attribute access to be
>>>> invoked on "whatever object x coerced back to". And we have no idea what
>>>> kind of object that is.
>>>>
>>>> If you don't transparently convert to object, it'd be safe to
>>>> automatically infer y as a double[:].
>>>
>>>
>>> Why can't y be inferred as the type of x due to the assignment?
>>>
>>>
>>>> On a related note, I've said before that I dislike the notion of
>>>>
>>>>     cdef double[:] mview = obj
>>>>
>>>> I'd rather like
>>>>
>>>>     cdef double[:] mview = double[:](obj)
>>>
>>>
>>> Why?
>>> We currently allow
>>>
>>>     cdef char* s = some_py_bytes_string
>>>
>>> Auto-coercion is a serious part of the language, and I don't see the
>>> advantage of requiring the redundancy in the case above. It's clear enough
>>> to me what the typed assignment is intended to mean: get me a buffer view
>>> on the object, regardless of what it is.
>>>
>>>
>>>> I support Robert in that "np.ndarray[double]" is the syntax to use when
>>>> you want this kind of transparent "be an object when I need to and a
>>>> memory view when I need to".
>>>>
>>>> Proposal:
>>>>
>>>> 1) We NEVER deprecate "np.ndarray[double]", we commit to keeping that in
>>>> the language. It means exactly what you would like double[:] to mean,
>>>> i.e. a variable that is memoryview when you need to and an object
>>>> otherwise. When you use this type, you bear the consequences of
>>>> early-binding things that could in theory be overridden.
>>>>
>>>> 2) double[:] is for when you want to access data of *any* Python object
>>>> in a generic way. Raw PEP 3118. In those situations, access to the
>>>> underlying object is much less useful.
>>>>
>>>> 2a) Therefore we require that you do "mview.asobject()" manually; doing
>>>> "mview.foo()" is a compile-time error
>>>
>>>
>>> Sounds good. I think that would clean up the current syntax overlap very
>>> nicely.
>>>
>>>
>>>> 2b) To drive the point home among users, and aid type inference and
>>>> overall language clarity, we REMOVE the auto-acquisition and require that
>>>> you do
>>>>
>>>>     cdef double[:] mview = double[:](obj)
>>>
>>>
>>> I don't see the point, as noted above. Either "obj" is statically typed
>>> and the bare assignment becomes a no-op, or it's not typed and the
>>> assignment coerces by creating a view. As with all other typed
>>> assignments.
>>>
>>>
>>>> 2c) Perhaps: Do not even coerce to a Python memoryview and disallow
>>>> "print mview"; instead require that you do "print mview.asmemoryview()"
>>>> or "print memoryview(mview)" or somesuch.
>>>
>>>
>>> This seems to depend on 2b.
>>
>>
>> This I don't understand. The question of 2c) is the analogue to
>> auto-coercion of "char*" to bytes; approving 2c) would put memoryviews in
>> line with char*.
>>
>> Then again, we could in future auto-coerce char* to a ctypes pointer, and in
>> that case, coercing a memoryview to an object representing that memoryview
>> would be OK.
>
> Character pointers coerce to strings. Hell, even structs coerce to and
> from Python dicts, so disallowing the same for memoryviews would just
> be inconsistent and inconvenient.

Also, if you don't allow coercion from Python, then memoryviews cannot be
used as 'def' function arguments and be called from Python.

>> Either way, you would never get back the same object that you coerced from!
>>
>> Dag
>>
>> _______________________________________________
>> cython-devel mailing list
>> cython-devel@python.org
>> http://mail.python.org/mailman/listinfo/cython-devel