On Fri, Nov 12, 2010 at 7:24 AM, Stefan Behnel <[email protected]> wrote:
> Hi,
>
> one of the CPython regression tests (test_long_future in Py2.7) failed
> because it used the constant expression "1L << 40000". We had this problem
> before: Cython currently calculates the result in the compiler and writes
> it literally into the C source. When I disable the folding for constants of
> that size, it actually writes "PyInt_FromLong(1L << 40000)", which is not a
> bit better.
>
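> To make the problem concrete, the generated C ends up looking roughly like
> this when the folding is disabled (just a sketch, temporary names made up):
>
>     /* the shift count exceeds the width of 'long', which is undefined
>        behaviour in C, so the result is wrong on any platform */
>     PyObject *__pyx_t_1 = PyInt_FromLong(1L << 40000);
>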
> I found this old thread related to this topic, but not much more:
>
> http://comments.gmane.org/gmane.comp.python.cython.devel/2449
>
> The main problem here is that we cannot make hard assumptions about the
> target storage type in C. We currently assume (more or less) that a 'long'
> is at least 32 bits, but if it happens to be 64 bits, it can hold much larger
> constants natively, and we can't know that at code generation time. So our
> best bet is to play safe and use Python computation for things that may not
> necessarily fit the target type. And, yes, my fellow friends of the math,
> this implies a major performance regression in cases where the value would
> actually fit at C compilation time but Cython cannot know that in advance.
>
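> Concretely, by "Python computation" I mean emitting something along these
> lines instead (again only a sketch, error handling omitted, and not
> necessarily the exact code we would generate):
>
>     /* evaluate 1 << 40000 at runtime using Python object arithmetic */
>     PyObject *one = PyInt_FromLong(1);
>     PyObject *shift = PyInt_FromLong(40000);
>     PyObject *result = PyNumber_Lshift(one, shift);  /* new reference */
>     Py_DECREF(one);
>     Py_DECREF(shift);
>
> That always yields the correct value, it just trades the formerly free
> folded constant for a few Python-level calls.
>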
> However, instead of changing the constant folding here, I think it would be
> better to implement type inference for integer literals. It can try to find
> a suitable type for a (folded or original) literal, potentially suggesting
> PyLong if we think there isn't a C type to handle it.
>
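> In terms of generated code, the inference would essentially pick between
> two forms like these (a sketch, the concrete helpers are open for
> discussion):
>
>     /* folded value fits a known C integer type: keep it a C constant */
>     long long small_enough = 1LL << 40;
>
>     /* folded value fits no C integer type: fall back to a Python long,
>        e.g. built from the folded value's decimal representation */
>     PyObject *too_big = PyLong_FromString("10000000000000000000000000",
>                                           NULL, 10);
>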
> The main problem with this approach is that explicitly disabling type
> inference would bring code right back to the above problem, which
> would surely be unexpected for users. So we might have to implement
> something similar at least for the type coercion of integer literals (to
> change literals into PyLong if a large constant coerces to a Python type).
>
> Does this make sense? Any better ideas?

I remember talking with Craig Citro about this earlier this year, and
choosing an arbitrary cutoff for inference of literals had odd
side-effects. (Perhaps most of this was due to trying to accommodate
arithmetic.) I think if the user writes
cdef long a = 1 << 47
this should emit pure C, but if they write
a = 1 << 47
then I'm OK with inferring this to be a Python object (which, I should
note, it currently does, though as a side note I'm not convinced we
should put such *huge* pre-computed literals in the source code...).
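Concretely, I'd expect roughly this for the two cases (a sketch with
made-up names, error handling omitted, and not necessarily what we
generate today; the first line is only correct when the target's long
is wide enough, but that's the user's explicit choice):

    /* cdef long a = 1 << 47  --  pure C, as requested by the user */
    long __pyx_v_a = 1L << 47;

    /* a = 1 << 47  --  inferred as a Python object; the folded value
       still fits a C integer type, so it can be boxed directly */
    PyObject *__pyx_v_b = PyLong_FromLongLong(1LL << 47);
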
Perhaps this is just about intermediate results, which is much
trickier. What was the exact code snippet?
- Robert