On Fri, Nov 12, 2010 at 2:20 PM, Stefan Behnel <[email protected]> wrote: > Stefan Behnel, 12.11.2010 21:36: >> Stefan Behnel, 12.11.2010 16:24: >>> one of the CPython regression tests (test_long_future in Py2.7) failed >>> because it used the constant expression "1L<< 40000". We had this problem >>> before, Cython currently calculates the result in the compiler and writes >>> it literally into the C source. When I disable the folding for constants of >>> that size, it actually writes "PyInt_FromLong(1L<< 40000)", which is not a >>> bit better. >>> >>> I found this old thread related to this topic but not much more >>> >>> http://comments.gmane.org/gmane.comp.python.cython.devel/2449 >>> >>> The main problem here is that we cannot make hard assumptions about the >>> target storage type in C. We currently assume (more or less) that a 'long' >>> is at least 32bit, but if it happens to be 64bit, it can hold much larger >>> constants natively, and we can't know that at code generation time. So our >>> best bet is to play safe and use Python computation for things that may not >>> necessarily fit the target type. And, yes, my fellow friends of the math, >>> this implies a major performance regression in the case that Cython cannot >>> know that it actually will fit at C compilation time. >>> >>> However, instead of changing the constant folding here, I think it would be >>> better to implement type inference for integer literals. It can try to find >>> a suitable type for a (folded or original) literal, potentially suggesting >>> PyLong if we think there isn't a C type to handle it. >>> >>> The main problem with this approach is that disabling type inference >>> explicitly will bring code back to suffering from the above problem, which >>> would surely be unexpected for users. So we might have to implement >>> something similar at least for the type coercion of integer literals (to >>> change literals into PyLong if a large constant coerces to a Python type). >>> >>> Does this make sense? Any better ideas? >> >> I opened a ticket for this: >> >> http://trac.cython.org/cython_trac/ticket/592 >> >> I also have an "almost ready" solution that basically leaves obvious C >> literals (enum names or numeric literals ending with "U" or "LL") unchanged >> and only checks that Python compatible literals are<= 32bit. If they are >> not, they are promoted to Python objects. Here's the code: >> >> def find_suitable_type_for_value(self): >> if self.constant_result is constant_value_not_set: >> try: >> self.calculate_constant_result() >> except ValueError: >> pass >> if self.constant_result in (constant_value_not_set, >> not_a_constant) or \ >> self.unsigned or self.longness == 'LL': >> # clearly a C literal (including enum names etc.) >> rank = (self.longness == 'LL') and 2 or 1 >> suitable_type = PyrexTypes.modifiers_and_name_to_type[ >> not self.unsigned, rank, "int"] >> if self.type: >> suitable_type = PyrexTypes.widest_numeric_type( >> suitable_type, self.type) >> else: >> # C literal or Python literal - split at 32bit boundary >> if self.constant_result>= -2**31 and \ >> self.constant_result< 2**31: >> if self.type and self.type.is_int: >> suitable_type = self.type >> else: >> suitable_type = PyrexTypes.c_long_type >> else: >> suitable_type = PyrexTypes.py_object_type >> return suitable_type >> >> Comments on the above so far? >> >> I also reworked the constant folding code quite a bit to make it work more >> nicely with changes like the above. > > I pushed several constant folding improvements in separate commits. The > code above is included here: > > http://hg.cython.org/cython-devel/rev/ec85b2d8aefb > > That's the one that I consider worth some serious testing because it may > have a behavioural impact on code. It should be the only one to roll back > if this change is not accepted.
Too bad Sage just recently died in the build bot. I'll try to see what's up with that. > There's also the follow-up commit for the constant folding transformation > > http://hg.cython.org/cython-devel/rev/346813017914 > > but I think that should be mostly safe, even without the preceding change. > > >> Would a change like this be suitable for 0.13.1 or rather 0.14? I'd say so, but I'm not 100% sure. Could you be more precise about what changed? Before, if you had something like a = 1000000000000000000 b = 1 << 100 it worked just fine. It's clear you did some major re-writing here though. Would it be fair to say the (post-changes) semantics of integer constants (literals or compile time expressions) are that they are C ints if they are less than 32 bits, and Python object otherwise? In both cases coercible to the other depending on the context (e.g. assignment to a cdef'd variable)? - Robert _______________________________________________ Cython-dev mailing list [email protected] http://codespeak.net/mailman/listinfo/cython-dev
