On 4 February 2012 19:39, Dag Sverre Seljebotn <d.s.seljeb...@astro.uio.no> wrote:
> On 02/03/2012 07:26 PM, mark florisson wrote:
>> On 3 February 2012 18:15, Dag Sverre Seljebotn
>> <d.s.seljeb...@astro.uio.no> wrote:
>>> On 02/03/2012 07:07 PM, mark florisson wrote:
>>>> On 3 February 2012 18:06, mark florisson <markflorisso...@gmail.com>
>>>> wrote:
>>>>> On 3 February 2012 17:53, Dag Sverre Seljebotn
>>>>> <d.s.seljeb...@astro.uio.no> wrote:
>>>>>> On 02/03/2012 12:09 AM, mark florisson wrote:
>>>>>>> On 2 February 2012 21:38, Dag Sverre Seljebotn
>>>>>>> <d.s.seljeb...@astro.uio.no> wrote:
>>>>>>>> On 02/02/2012 10:16 PM, mark florisson wrote:
>>>>>>>>> On 2 February 2012 12:19, Dag Sverre Seljebotn
>>>>>>>>> <d.s.seljeb...@astro.uio.no> wrote:
>>>>>>>>>> I just realized that
>>>>>>>>>>
>>>>>>>>>> cdef int[:] a = None
>>>>>>>>>>
>>>>>>>>>> raises an exception; even though I'd argue that 'a' is of the
>>>>>>>>>> "reference" kind of type where Cython usually allow None (i.e.,
>>>>>>>>>> "cdef MyClass b = None" is allowed even if type(None) is NoneType).
>>>>>>>>>> Is this a bug or not, and is it possible to do something about it?
>>>>>>>>>>
>>>>>>>>>> Dag Sverre
>>>>>>>>>> _______________________________________________
>>>>>>>>>> cython-devel mailing list
>>>>>>>>>> cython-devel@python.org
>>>>>>>>>> http://mail.python.org/mailman/listinfo/cython-devel
>>>>>>>>>
>>>>>>>>> Yeah I disabled that quite early. It was supposed to be working but
>>>>>>>>> gave a lot of trouble in cases (segfaults, mainly). At the time I was
>>>>>>>>> trying to get rid of all the segfaults and get the basic functionality
>>>>>>>>> working, so I disabled it.
>>>>>>>>> Personally, I have never liked how things
>>>>>>>>
>>>>>>>> Well, you can segfault quite easily with
>>>>>>>>
>>>>>>>> cdef MyClass a = None
>>>>>>>> print a.field
>>>>>>>>
>>>>>>>> so it doesn't make sense to treat slices differently from cdef
>>>>>>>> classes IMO.
>>>>>>>>
>>>>>>>>> can be None unchecked. I personally prefer to write
>>>>>>>>>
>>>>>>>>> cdef foo(obj=None):
>>>>>>>>>     cdef int[:] a
>>>>>>>>>     if obj is None:
>>>>>>>>>         obj = ...
>>>>>>>>>     a = obj
>>>>>>>>>
>>>>>>>>> Often you forget to write 'not None' when declaring the parameter
>>>>>>>>> (and apparently that is only allowed for 'def' functions).
>>>>>>>>>
>>>>>>>>> As such, I never bothered to re-enable it. However, it does support
>>>>>>>>> control flow with uninitialized slices, and will raise an error if it
>>>>>>>>> is uninitialized. Do we want this behaviour (e.g. for consistency)?
>>>>>>>>
>>>>>>>> When in doubt, go for consistency. So +1 for that reason. I do believe
>>>>>>>> that setting stuff to None is rather vital in Python.
>>>>>>>>
>>>>>>>> What I typically do is more like this:
>>>>>>>>
>>>>>>>> def f(double[:] input, double[:] out=None):
>>>>>>>>     if out is None:
>>>>>>>>         out = np.empty_like(input)
>>>>>>>>     ...
>>>>>>>>
>>>>>>>> Having to use another variable name is a bit of a pain. (Come on -- do
>>>>>>>> you use "a" in real code? What do you actually call "the other obj"? I
>>>>>>>> sometimes end up with "out_" and so on, but it creates smelly code
>>>>>>>> quite quickly.)
>>>>>>>
>>>>>>> No, it was just a contrived example.
>>>>>>>
>>>>>>>> It's easy to segfault with cdef classes anyway, so decent nonechecking
>>>>>>>> should be implemented at some point, and then memoryviews would use
>>>>>>>> the same mechanisms. Java has decent null-checking...
>>>>>>>>
>>>>>>>
>>>>>>> The problem with none checking is that it has to occur at every point.
>>>>>>
>>>>>> Well, using control flow analysis etc. it doesn't really. E.g.,
>>>>>>
>>>>>> for i in range(a.shape[0]):
>>>>>>     print i
>>>>>>     a[i] *= 3
>>>>>>
>>>>>> can be unrolled and none-checks inserted as
>>>>>>
>>>>>> print 0
>>>>>> if a is None: raise ....
>>>>>> a[0] *= 3
>>>>>> for i in range(1, a.shape[0]):
>>>>>>     print i
>>>>>>     a[i] *= 3 # no need for none-check
>>>>>>
>>>>>> It's very similar to what you'd want to do to pull boundschecking out
>>>>>> of the loop...
>>>>>>
>>>>>
>>>>> Oh, definitely. Both optimizations may not always be possible to do,
>>>>> though. The optimization (for boundschecking) is easier for prange()
>>>>> than range(), as you can immediately raise an exception as the
>>>>> exceptional condition may be issued at any iteration. What do you do
>>>>> with bounds checking when some accesses are in-bound, and some are
>>>>> out-of-bound? Do you immediately raise the exception? Are we fine with
>>>>> aborting (like Fortran compilers do when you ask them for bounds
>>>>> checking)? And how do you detect that the code doesn't already raise
>>>>> an exception or break out of the loop itself to prevent the
>>>>> out-of-bound access? (Unless no exceptions are propagating and no
>>>>> break/return is used, but exceptions are so very common).
>>>>>
>>>>>>> With initialized slices the control flow knows when the slices are
>>>>>>> initialized, or when they might not be (and it can raise a
>>>>>>> compile-time or runtime error, instead of a segfault if you're
>>>>>>> lucky).
>>>>>>> I'm fine with implementing the behaviour, I just always left it at
>>>>>>> the bottom of my todo list.
>>>>>>
>>>>>> Wasn't saying you should do it, just checking.
>>>>>>
>>>>>> I'm still not sure about this.
>>>>>> I think what I'd really like is
>>>>>>
>>>>>> a) Stop cdef classes from being None as well
>>>>>>
>>>>>> b) Sort-of deprecate cdef in favor of cast/assertion type statements
>>>>>> that help the type inferences:
>>>>>>
>>>>>> def f(arr):
>>>>>>     if arr is None:
>>>>>>         arr = ...
>>>>>>     arr = int[:](arr) # equivalent to "cdef int[:] arr = arr", but
>>>>>>                       # acts as statement, with a specific point
>>>>>>                       # for the none-check
>>>>>>     ...
>>>>>>
>>>>>> or even:
>>>>>>
>>>>>> def f(arr):
>>>>>>     if arr is None:
>>>>>>         return 'foo'
>>>>>>     else:
>>>>>>         arr = int[:](arr) # takes effect *here*, does none-check
>>>>>>         ...
>>>>>>     # arr still typed as int[:] here
>>>>>>
>>>>>> If we can make this work well enough with control flow analysis I'd
>>>>>> never cdef declare local vars again :-)
>>>>>
>>>>> Hm, what about the following?
>>>>>
>>>>> def f(arr):
>>>>>     if arr is None:
>>>>>         return 'foo'
>>>>>
>>>>>     cdef int[:] arr # arr may not be None
>>>>
>>>> The above would work in general, until the declaration is lexically
>>>> encountered, the object is typed as object.
>>>
>>> This was actually going to be my first proposal :-) That would finally
>>> define how "cdef" inside of if-statements etc. behave too (simply use
>>> control flow analysis and treat it like a statement).
>>
>> Block-local declarations are definitely something we want, although I
>> think it would require some more (non-trivial) changes to the
>> compiler.
>
> Note that my proposal was actually not about block-local declarations.
>
> Block-local:
>
> {
>     int x = 4;
> }
> /* x not available here */
>
> My idea was much more like hints to control flow analysis. That is, I wanted
> to have this raise an error:
>
> x = 'adf'
> if foo():
>     cdef int x = y
> print x # type of x not known
>
> This is OK:
>
> if foo():
>     cdef int x = y
> else:
>     cdef int x = 4
> print x # ok, type the same anyway -- so type "escapes" block

Seeing that it doesn't work that way in any language with block scopes,
I find that pretty surprising behaviour. Why would you not simply
mandate that the user declares 'x' outside of the blocks?

> And I would allow
>
> cdef str x = y
> if foo:
>     cdef int x = int(x)
>     return g(x) # x must be int
> print x # x must be str at this point
>
>
> The reason for this madness is simply that control statements do NOT create
> blocks in Python, and making it so in Cython is just confusing. It would
> bring too much of C into the language for my taste.

And yet it can be very useful and intuitive in several contexts, just
not for objects (which aren't typed anyway!). Block-local declarations
are useful when a variable is only used in the block and it can be
useful to make variables private in the cython.parallel context
("assignment makes private" is really not as intuitive). It's not a
very important feature though, and it's indeed more a thing from
static languages than Python.

> I think that in my Cython-utopia, Symtab.py is only responsible for
> resolving the scope of *names*, and types of things are not bound to blocks,
> just to the state at control flow points.
>
> Of course, implementing this would be a nightmare.
>
>
>> Maybe the cleanup code from functions, as well as the temp handling
>> etc could be re-factored to a BlockNode, that all block nodes could
>> subclass. They'd have to instantiate new symbol table environments as
>> well. I'm not yet entirely sure what else would be involved in the
>> implementation of that.
>>
>>> But I like int[:] as a way of making it pure Python syntax compatible as
>>> well. Perhaps the two are orthogonal -- a) make variable declaration a
>>> statement, b) make cython.int[:](x) do, essentially, a cdef declaration,
>>> for Python compatibility.
>>>
>>
>> Don't we have cython.declare() for that? e.g.
>>
>> arr = cython.declare(cython.int[:])
>>
>> That would also be treated as a statement like normal declarations (if
>> and when implemented).
>
> This was what I said, but it wasn't what I meant. Sorry. I'll try to explain
> better:
>
> 1) There's no way to have the above actually do the right thing in Python.
> With "arr = cython.int[:](arr)" one could actually return a NumPy or
> NumPy-like array that works in Python (since "arr" might not have the
> "shape" attribute before the conversion, all we know is that it exports the
> buffer interface...).

Right, but the same thing goes for other types as well. E.g. I can
type something int with cython.declare() and then use strings instead.

> 2) I don't like the fact that we overload the assignment operator to acquire
> a view. "cdef np.ndarray[int] x = y" is fine since if you do "x.someattr"
> then a NumPy subclass could provide someattr and it works fine. Acquiring a
> view is just something different.

Yeah it's kind of overloaded, but in a good way :) It's the language
that does the overloading, which means it's not very surprising. And
the memoryview slices coerce to numpy-like (although somewhat
incapable) objects and support some of their attributes. I like the
simplicity of assignment here, you don't really care that it takes a
view, you just want to access and operate on the data.

What do you think of allowing the user to register a
conversion-to-object function? And perhaps the default should be that
if a view was never sliced, it just returns the original object
(although that might mean you get back objects with incompatible
interfaces...).

> 3) Hence I guess I like "arr = int[:](arr)" better both for Cython and
> Python; at least if "arr" is always type-inferred to be int[:], even if arr
> was an "object" further up in the code (really, if you do "x = f(x)" at the
> top-level of the function, then x can just take the identity of another
> variable from that point on -- I don't know if the current control flow
> analysis and type inference do this though?)
>
>
> Dag Sverre
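
For reference, the none-check hoisting discussed earlier in the thread (peel the first loop iteration so the check runs once instead of on every access) can be sketched in plain Python. This is only an illustration under assumptions, not what Cython generates: the names scale_checked and scale_peeled, the explicit length argument n, and the TypeError message are all made up for the example, and a memoryview over array.array merely stands in for a typed slice.

```python
from array import array


def scale_checked(a, n, factor):
    """Baseline: a None check before every single element access."""
    for i in range(n):
        if a is None:
            raise TypeError("slice is None")  # hypothetical error message
        a[i] *= factor


def scale_peeled(a, n, factor):
    """Same behaviour, but the first iteration is peeled off the loop.

    'a' is never reassigned inside the loop, so once the check has passed
    for element 0 it cannot fail for the remaining elements.
    """
    if n <= 0:
        return  # loop body never runs, so no check is needed either
    if a is None:
        raise TypeError("slice is None")  # hypothetical error message
    a[0] *= factor
    for i in range(1, n):
        a[i] *= factor  # no None check needed here


# A writable memoryview over an array of doubles plays the role of the slice.
view = memoryview(array("d", [1.0, 2.0, 3.0]))
scale_peeled(view, len(view), 3.0)
print(view.tolist())
```

The early return for n <= 0 matters: without it the peeled version would raise for a None slice even when the original loop would never have touched it, which is the kind of behavioural difference raised above for the bounds-checking case.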