Re: [Cython] Type inference and memory management

Stefan Behnel Mon, 12 Jul 2010 01:11:22 -0700

Craig Citro, 12.07.2010 08:50:
> I ran into an interesting question while getting Sage to build and
> pass tests against the current cython-devel tip (side note: it does on
> both my laptop and sage.math!)


Ah, finally! Nice!


>      def __repr__(self):
>          cdef char* ss = fmpz_poly_to_string(self.poly)
>          s = ss
>          free(ss)
>          return s

> Now enter type inference. It looks at this block and says, "hey, s is
> only ever assigned to a char * -- let's call it a char *, too." Of
> course, this is a disaster -- it changes the semantics of the
> all-important "s = ss" line. As a result, the return value is (a
> Python copy of) some random junk. This is easy enough to fix -- we can
> be more explicit about our intentions, and declare s to be an object,
> which works great. However, this is likely to break at least some user
> code in the wild -- especially since we've been recommending this as
> the "right" way to do things.

Yes, Robert and I were aware of this at the time. The problem is that the
above code really *is* broken, because it relies on Cython not being smart
enough to understand what's going on. Code like this needs fixing as early
as possible, because Cython is only ever going to get smarter and this
*will* have to break at some point. So we thought it would be better to
break it early, rather than carrying along some kind of backwards-compatible
quirk to keep this 'working' (which can't be done in most cases anyway
without disabling most of the current type inference). Any kind of rule
that we add here would make things less predictable and less
understandable.
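
The difference between the two readings of "s = ss" can be sketched in
plain Python with ctypes. This is only an analogy under stated assumptions,
not what Cython generates: the buffer contents are made up, and a memset()
stands in for free(ss) so that the demonstration stays well-defined instead
of reading released memory.

```python
import ctypes

# Hypothetical stand-in for the C string returned by fmpz_poly_to_string().
buf = ctypes.create_string_buffer(b"x^2 + 1")

# "s = ss" with s inferred as char*: a second pointer to the same memory.
alias = ctypes.cast(buf, ctypes.c_char_p)

# "s = ss" with s as a Python object: the bytes are copied into a new object.
copy = ctypes.string_at(buf)

# Stand-in for free(ss): overwrite the buffer rather than releasing it.
ctypes.memset(buf, ord("?"), len(copy))

# The alias now sees the clobbered memory; the copy is unaffected.
alias_view = alias.value
```

Here `copy` still holds the original text, while `alias_view` reflects
whatever the buffer contains after the "free". With a real free() the alias
would point at released memory, which is the "random junk" described above.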


> 1) Break the code above, tell people to explicitly declare things to be 
> objects.

Or use a cast, yes. That's my favourite.


> 2) Decide that if a variable gets returned by a function which is
> either a def'd function or returns a Python object, and we don't have
> an explicit type declaration already, then we only infer something
> which is a subtype of Python object.

That would be an insufficient and rather hard-to-explain rule. It basically
says: if Cython happens to be smart enough to figure out that the value
will end up being returned from the function, the code will behave
differently than when it fails to trace the value propagation. IMHO, that
is much worse than the current behaviour.


> (Right now, we almost never infer
> anything more specific than Python object anyway.)

Well, yes, "almost". It's already false in some important cases and those 
cases are very likely to grow in the future.


> (1) is going to generate (potentially)
> faster code, but (2) is going to be much friendlier for someone
> migrating Python code.

No. Someone who migrates Python code will not start from code that relies 
on the above anyway. Remember that this only happens with pointers, so pure 
Python code is not affected.
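
As a small illustration of that last point, here is the same pattern
written with Python objects only. The function name and the string are
hypothetical. Assignment merely binds a second name to the same object, and
dropping the first name does not release it, so there is no pointer to
invalidate.

```python
def render():
    ss = b"x^2 + 1"   # a Python bytes object, not a C pointer
    s = ss            # second reference to the same object
    del ss            # removes one name; the object stays alive via s
    return s


result = render()
```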

Stefan
_______________________________________________
Cython-dev mailing list
[email protected]
http://codespeak.net/mailman/listinfo/cython-dev
