On 7 May 2012 12:51, Dag Sverre Seljebotn <d.s.seljeb...@astro.uio.no> wrote:
> On 05/07/2012 01:48 PM, Dag Sverre Seljebotn wrote:
>>
>> On 05/07/2012 01:10 PM, Stefan Behnel wrote:
>>>
>>> Dag Sverre Seljebotn, 07.05.2012 12:40:
>>>>
>>>> moving to dev list
>>>
>>> Makes sense.
>>>
>>>> On 05/07/2012 11:17 AM, Stefan Behnel wrote:
>>>>>
>>>>> Dag Sverre Seljebotn, 07.05.2012 10:44:
>>>>>>
>>>>>> On 05/07/2012 07:48 AM, Stefan Behnel wrote:
>>>>>>>
>>>>>>> I wonder why a memory view should be allowed to be None in the
>>>>>>> first place. Buffer arguments aren't (because they get unpacked
>>>>>>> on entry), so why should memory views?
>>>>>>
>>>>>> ? At least when I implemented it, buffers get unpacked but the
>>>>>> case of a None buffer is treated specially, and you're fully
>>>>>> allowed (and segfault if you [] it).
>>>>>
>>>>> Hmm, ok, maybe I just got confused by the code then.
>>>>>
>>>>> I think the docs should state that buffer arguments are best used
>>>>> together with the "not None" declaration then.
>>>
>>> ... which made me realise that that wasn't even supported. I can't
>>> believe no-one ever reported that as a bug...
>>>
>>> https://github.com/cython/cython/commit/f2de49fd0ac82a02a070b931bf4d2dab47135d0b
>>>
>>> It's still not supported for memory views.
>>>
>>> BTW, is there a reason why we shouldn't allow a "not None"
>>> declaration for cdef functions? Obviously, the caller would have to
>>> do the check in that case. Hmm, maybe it's not that important,
>>> because None checks are best done at entry points from user code,
>>> which usually means Python code. It seems like "not None" is not
>>> supported on cpdef functions, though.
>>>
>>>> I use them with "=None" default values all the time... then do a
>>>> None-check manually.
>>>
>>> Interesting. Could you give an example? What's the advantage over
>>> letting Cython raise an error for you?
>>> And, since you are using it as a default argument, why would someone
>>> want to call your code entirely without a buffer argument?
>>
>> Here you go:
>>
>> def foo(np.ndarray[double] a, np.ndarray[double] out=None):
>>     if out is None:
>>         out = np.empty_like(a)
>>     # compute result in out
>>     return out
>>
>> The pattern of handing in the memory area to write to is one of the
>> fundamental basics of numerical computing; you often just can't
>> implement an algorithm if the called function returns the result in a
>> newly-allocated array. I can explain why that is in detail, but I'd
>> rather you just trusted the testimony of somebody doing numerical
>> computation...
>>
>> It's just a convenience, but often (in particular when testing) it's
>> incredibly convenient to not have to bother with allocating the output
>> array.
>>
>> Another pattern is:
>>
>> def do_something(np.ndarray[double] a,
>>                  np.ndarray[double] sin_of_a=None):
>>     ...
>>
>> so if your caller happened to already have computed something, the
>> function uses it, but OTOH the "something" is a function of the inputs
>> and can be computed on the fly. AND, sometimes it can be computed on
>> the fly in ways more efficient than what the caller could have done,
>> because of memory bus issues etc. etc.
>>
>> Both of these can be "fixed" by a) not allowing the convenient
>> shorthand, or b) declare the argument "object" first and then type it
>> after the "preamble".
>>
>> So the REAL reason I'm arguing this case is consistency with cdef
>> classes.
>>
>>>
>>>> It's really no different from cdef classes.
>>>
>>> I find it at least a bit more surprising because a buffer unpacking
>>> argument is a rather strong hint that you expect something that
>>> supports this protocol. The fact that you type your function argument
>>> with it hints at the intention to properly unpack it on entry.
>>> I'm sure there are lots of users who were or will be surprised when
>>> they realise that that doesn't exclude None values.
>>
>> Whereas I think there would be more users surprised by the opposite.
>>
>> So there -- we won't know who's right without actually finding some
>> users. And chances are we are both right, since users are different
>> from one another.
>>
>>>
>>>>> And I remember that we wanted to change the default settings for
>>>>> extension type arguments from "or None" to "not None" years ago
>>>>> but never actually did it.
>>>>
>>>> I remember that there was such a debate, but I certainly don't
>>>> remember that this was the conclusion :-)
>>>
>>> Maybe not, yes.
>>>
>>>> I didn't agree with that view then and I don't now. I don't
>>>> remember what Robert's view was...
>>>>
>>>> As far as I can remember (which might be biased towards my personal
>>>> view), the conclusion was that we left the current semantics in
>>>> place, relying on better control flow analysis to make None-checks
>>>> cheaper, and when those are cheap enough, make the nonecheck
>>>> directive default to True
>>>
>>> At least for buffer arguments, it silently corrupts data or segfaults
>>> in the current state of affairs, as you pointed out. Not exactly
>>> ideal.
>>
>> No different than writing to a field in a cdef class...
>
> Also, I believe that in the strided case, the strides are all set to 0,
> and the data-pointer is NULL, so you will never corrupt data, you will
> always try to access *NULL and segfault.
>
> Though if you put mode='c' and a very high index you'll corrupt data.
>
> Dag
>
If you have bounds checking on, you'll get an out of bounds error, which
is pretty weird :)

>>
>>>
>>> That's another reason why I see a difference between the behaviour of
>>> extension types and that of buffer arguments. Buffer indexing is also
>>> way more performance critical than the average method call or
>>> attribute access on a cdef class.
>>
>> Perhaps, but that's a bit hand-wavy to turn into a principle of
>> language design? "This is performance critical, so therefore we
>> suddenly invert the normal rule"?
>>
>> I just think we should be consistent, not have more special rules for
>> buffers than we need to.
>>
>> The intention all the time was that "np.ndarray[double]" is just a
>> glorified "np.ndarray". People expect it to behave like an optimized
>> "np.ndarray". If "np.ndarray" can be None, why can't
>> "np.ndarray[double]"?
>>
>> BTW, with the coming of memoryviews, me and Mark talked about just
>> deprecating the "mytype[...]" meaning buffers, and rather treat it as
>> np.ndarray, array.array etc. being some sort of "template types". That
>> is, we disallow "object[int]" and require some special declarations in
>> the relevant pxd files.
>>
>>>> (Java is sort of prior art that this can indeed be done?).
>>>
>>> Java was designed to have a JIT compiler underneath which handles
>>> external parameters, and its compilers are way smarter than Cython. I
>>> agree that there is still a lot we can do based on better static
>>> analysis, but there will always be limits.
>>
>> Any static analysis will be able to get you to the point of "not None"
>> if the user has a manual test. And the Python way is often to just
>> spell things out rather than aim for brevity; I think an explicit
>> if-test is much more newbie friendly than "not None", "or None", etc.
>>
>> Performance beyond that is rather theoretical for the moment.
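The explicit if-test on an optional output argument discussed above can be tried without Cython at all. What follows is only a sketch in plain Python, not code from this thread: sin_into is a hypothetical name, and the stdlib array module is assumed as a stand-in for np.ndarray[double] so that the example runs without NumPy.

```python
import math
from array import array


def sin_into(a, out=None):
    """Store sin(x) for every x in `a` into `out` and return `out`.

    `sin_into` is a hypothetical helper for illustration only; `out`
    mirrors the "out=None" argument pattern from the thread.
    """
    # Manual None test: allocate the output only when the caller
    # did not hand in a buffer to write to.
    if out is None:
        out = array("d", [0.0]) * len(a)
    if len(out) != len(a):
        raise ValueError("out must have the same length as a")
    for i, x in enumerate(a):
        out[i] = math.sin(x)
    return out
```

Calling sin_into(a) allocates a fresh array of the same length as a, while sin_into(a, buf) writes into buf and returns that same object, so the caller decides where the result lives.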
>>
>> I agree that for memoryviews that can be passed in acquired-state to
>> cdef functions there is the question of eliminating an extra branch or
>> so, but that is still far-fetched, and I'd rather Mark raise the issue
>> if it becomes an issue than the two of us bikeshedding over it.
>>
>> I'll try to make this my last post to this thread, I feel we're
>> slipping into Dag-and-Stefan-endless-thread territory...
>>
>> Dag
>
> _______________________________________________
> cython-devel mailing list
> cython-devel@python.org
> http://mail.python.org/mailman/listinfo/cython-devel