#6199: Integer * int is slow
-------------------------------+--------------------------------------------
 Reporter:  fredrik.johansson  |         Owner:  somebody                      
     Type:  defect             |        Status:  new                           
 Priority:  major              |     Milestone:  sage-duplicate/invalid/wontfix
Component:  basic arithmetic   |    Resolution:                                
 Keywords:                     |        Author:                                
 Upstream:  N/A                |      Reviewer:                                
   Merged:                     |   Work_issues:                                
-------------------------------+--------------------------------------------
Changes (by craigcitro):

  * status:  closed => new
  * resolution:  wontfix =>


Comment:

 Hi Fredrik,

 Well, let's not give up hope yet. :) I'll give the full explanation of
 what's happening with the coercion system below, but first, I want to ask:
 is there any reason you couldn't just switch to using
 `sage.integer.Integer` everywhere, or at least in more places? (If you're
 worried about modifying existing code, the preparser can do the hard work
 for you, and we can definitely adapt it if needed.) Or, in other words,
 where are you getting all these Python `int`s from?

 Also, I want to note that there's one more reason to not worry so much:
 arithmetic on Python `int`s is going to take at least a ~%20 performance
 hit in Python 3 (at least for the example at the top of this ticket).

 Here's what's happening with this code. Every object that derives from
 Sage's `Element` type goes through the coercion system. It does a lot of
 things for us: in particular, it makes things like
 {{{
 sage: R.<x> = ZZ[]
 sage: 1/2 + x
 1/2 + x
 sage: parent(1/2 + x)
 Univariate Polynomial Ring in x over Rational Field
 }}}
 happen without any hard work on the part of the user, and with a minimal
 amount of code duplication. So what you're suggesting above does in fact
 happen: the `__mul__` code ultimately calls into the `bin_op` method of
 the coercion model (which is itself an object, and has lots of useful
 methods; try `cm = get_coercion_model()` from the sage prompt), and that
 checks a few things in order for a multiplication, say `x*y`:

  * are the parents of `x` and `y` identical? If so, call the `_mul_`
 method. (This actually happens in `__mul__`.)
  * have we saved an action of `x.parent()` on `y.parent()`, or vice-versa?
 If so, apply that. If not, see if we can find one, and use it if we can.
  * can we coerce `x` and `y` into the same parent? If so, do it and then
 call `_mul_` on the result.
  * some other stuff.

 So what's happening right now is that we're hitting the third bullet, i.e.
 we end up doing the coercion of the `int` into `ZZ`, as you noted in the
 ticket description. What I did was write some code that added an action of
 `int` objects on `ZZ`, which gets picked up in the second bullet above.
 This is the "natural" way to add this kind of thing to the coercion model.
 The problem is that it's still too slow for what you want: Robert wrote a
 custom dictionary for storing the actions, but it's still doing work on
 the order of hundreds of nanoseconds to get called and look it up. For
 most everything, this isn't so bad, but in your case, it's causing
 trouble.

 Now, that's saying that the "right" fix, i.e. use the coercion model the
 way it's intended, doesn't work out. Of course, that doesn't mean we don't
 have other options. I just sat down and hacked together a little code that
 short-circuits things earlier on (just before the second bullet above). It
 definitely works ... here's before:
 {{{
 sage: a = 123 ; b = 456 ; c = 456r ; y = polygen(ZZ)
 sage: a*c
 56088
 sage: %timeit a*c
 1000000 loops, best of 3: 1.27 us per loop
 sage: %timeit a*y
 1000000 loops, best of 3: 1.97 us per loop
 sage: %timeit a*b
 10000000 loops, best of 3: 150 ns per loop
 }}}
 and after:
 {{{
 sage: a = 123 ; b = 456 ; c = 456r ; y = polygen(ZZ)
 sage: a*c
 56088
 sage: %timeit a*c
 1000000 loops, best of 3: 207 ns per loop
 sage: %timeit a*b
 10000000 loops, best of 3: 149 ns per loop
 sage: %timeit a*y
 1000000 loops, best of 3: 1.99 us per loop
 }}}
 In the case that it's not `Integer * int`, it's one test of a C `int`
 being nonzero and between two and four extra pointer comparisons on any
 call to `bin_op`. Mind you, that's going to happen for '''every''' binary
 arithmetic operation where the parents aren't identical, which is
 definitely a hefty price if you live in a world where you don't multiply
 `Integer * int` very often. I think we could eliminate the C `int` test,
 if we were crafty.

 (Side note: I got a ~10% performance regression on the `a*y` line above
 between 4.2 and 4.3 -- do either of you have both installed to try it
 out?)

 Robert, what are your thoughts? I know that "Special cases aren't special
 enough to break the rules," and in general I don't like special-casing
 things. Plus, it's potentially less significant with Py3 on the horizon.
 But if it's really going to be a showstopper for Fredrik, we should at
 least think about it more. It's not like there are very many types where
 the operations are noticeably faster than a few lookups, so it's not
 opening a loophole that could be abused very much, I think.

-- 
Ticket URL: <http://trac.sagemath.org/sage_trac/ticket/6199#comment:5>
Sage <http://www.sagemath.org>
Sage: Creating a Viable Open Source Alternative to Magma, Maple, Mathematica, 
and MATLAB

-- 
You received this message because you are subscribed to the Google Groups 
"sage-trac" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to 
[email protected].
For more options, visit this group at 
http://groups.google.com/group/sage-trac?hl=en.

Reply via email to