Stefan Behnel, 12.11.2010 16:24:
> one of the CPython regression tests (test_long_future in Py2.7) failed
> because it used the constant expression "1L<<  40000". We had this problem
> before, Cython currently calculates the result in the compiler and writes
> it literally into the C source. When I disable the folding for constants of
> that size, it actually writes "PyInt_FromLong(1L<<  40000)", which is not a
> bit better.
>
> I found this old thread related to this topic but not much more
>
> http://comments.gmane.org/gmane.comp.python.cython.devel/2449
>
> The main problem here is that we cannot make hard assumptions about the
> target storage type in C. We currently assume (more or less) that a 'long'
> is at least 32bit, but if it happens to be 64bit, it can hold much larger
> constants natively, and we can't know that at code generation time. So our
> best bet is to play safe and use Python computation for things that may not
> necessarily fit the target type. And, yes, my fellow friends of the math,
> this implies a major performance regression in the case that Cython cannot
> know that it actually will fit at C compilation time.
>
> However, instead of changing the constant folding here, I think it would be
> better to implement type inference for integer literals. It can try to find
> a suitable type for a (folded or original) literal, potentially suggesting
> PyLong if we think there isn't a C type to handle it.
>
> The main problem with this approach is that disabling type inference
> explicitly will bring code back to suffering from the above problem, which
> would surely be unexpected for users. So we might have to implement
> something similar at least for the type coercion of integer literals (to
> change literals into PyLong if a large constant coerces to a Python type).
>
> Does this make sense? Any better ideas?

I opened a ticket for this:

http://trac.cython.org/cython_trac/ticket/592

I also have an "almost ready" solution that basically leaves obvious C 
literals (enum names or numeric literals ending with "U" or "LL") unchanged 
and only checks that Python compatible literals are <= 32bit. If they are 
not, they are promoted to Python objects. Here's the code:

     def find_suitable_type_for_value(self):
         if self.constant_result is constant_value_not_set:
             try:
                 self.calculate_constant_result()
             except ValueError:
                 pass
         if self.constant_result in (constant_value_not_set,
                                     not_a_constant) or \
                self.unsigned or self.longness == 'LL':
             # clearly a C literal (including enum names etc.)
             rank = (self.longness == 'LL') and 2 or 1
             suitable_type = PyrexTypes.modifiers_and_name_to_type[
                                        not self.unsigned, rank, "int"]
             if self.type:
                 suitable_type = PyrexTypes.widest_numeric_type(
                     suitable_type, self.type)
         else:
             # C literal or Python literal - split at 32bit boundary
             if self.constant_result >= -2**31 and \
                        self.constant_result < 2**31:
                 if self.type and self.type.is_int:
                     suitable_type = self.type
                 else:
                     suitable_type = PyrexTypes.c_long_type
             else:
                 suitable_type = PyrexTypes.py_object_type
         return suitable_type

Comments on the above so far?

I also reworked the constant folding code quite a bit to make it work more 
nicely with changes like the above.

Would a change like this be suitable for 0.13.1 or rather 0.14?

Stefan
_______________________________________________
Cython-dev mailing list
[email protected]
http://codespeak.net/mailman/listinfo/cython-dev

Reply via email to