#6199: Integer * int is slow
-------------------------------+--------------------------------------------
Reporter: fredrik.johansson | Owner: somebody
Type: defect | Status: new
Priority: major | Milestone: sage-duplicate/invalid/wontfix
Component: basic arithmetic | Resolution:
Keywords: | Author:
Upstream: N/A | Reviewer:
Merged: | Work_issues:
-------------------------------+--------------------------------------------
Changes (by craigcitro):
* status: closed => new
* resolution: wontfix =>
Comment:
Hi Fredrik,
Well, let's not give up hope yet. :) I'll give the full explanation of
what's happening with the coercion system below, but first, I want to ask:
is there any reason you couldn't just switch to using
`sage.integer.Integer` everywhere, or at least in more places? (If you're
worried about modifying existing code, the preparser can do the hard work
for you, and we can definitely adapt it if needed.) Or, in other words,
where are you getting all these Python `int`s from?
Also, I want to note that there's one more reason to not worry so much:
arithmetic on Python `int`s is going to take at least a ~%20 performance
hit in Python 3 (at least for the example at the top of this ticket).
Here's what's happening with this code. Every object that derives from
Sage's `Element` type goes through the coercion system. It does a lot of
things for us: in particular, it makes things like
{{{
sage: R.<x> = ZZ[]
sage: 1/2 + x
1/2 + x
sage: parent(1/2 + x)
Univariate Polynomial Ring in x over Rational Field
}}}
happen without any hard work on the part of the user, and with a minimal
amount of code duplication. So what you're suggesting above does in fact
happen: the `__mul__` code ultimately calls into the `bin_op` method of
the coercion model (which is itself an object, and has lots of useful
methods; try `cm = get_coercion_model()` from the sage prompt), and that
checks a few things in order for a multiplication, say `x*y`:
* are the parents of `x` and `y` identical? If so, call the `_mul_`
method. (This actually happens in `__mul__`.)
* have we saved an action of `x.parent()` on `y.parent()`, or vice-versa?
If so, apply that. If not, see if we can find one, and use it if we can.
* can we coerce `x` and `y` into the same parent? If so, do it and then
call `_mul_` on the result.
* some other stuff.
So what's happening right now is that we're hitting the third bullet, i.e.
we end up doing the coercion of the `int` into `ZZ`, as you noted in the
ticket description. What I did was write some code that added an action of
`int` objects on `ZZ`, which gets picked up in the second bullet above.
This is the "natural" way to add this kind of thing to the coercion model.
The problem is that it's still too slow for what you want: Robert wrote a
custom dictionary for storing the actions, but it's still doing work on
the order of hundreds of nanoseconds to get called and look it up. For
most everything, this isn't so bad, but in your case, it's causing
trouble.
Now, that's saying that the "right" fix, i.e. use the coercion model the
way it's intended, doesn't work out. Of course, that doesn't mean we don't
have other options. I just sat down and hacked together a little code that
short-circuits things earlier on (just before the second bullet above). It
definitely works ... here's before:
{{{
sage: a = 123 ; b = 456 ; c = 456r ; y = polygen(ZZ)
sage: a*c
56088
sage: %timeit a*c
1000000 loops, best of 3: 1.27 us per loop
sage: %timeit a*y
1000000 loops, best of 3: 1.97 us per loop
sage: %timeit a*b
10000000 loops, best of 3: 150 ns per loop
}}}
and after:
{{{
sage: a = 123 ; b = 456 ; c = 456r ; y = polygen(ZZ)
sage: a*c
56088
sage: %timeit a*c
1000000 loops, best of 3: 207 ns per loop
sage: %timeit a*b
10000000 loops, best of 3: 149 ns per loop
sage: %timeit a*y
1000000 loops, best of 3: 1.99 us per loop
}}}
In the case that it's not `Integer * int`, it's one test of a C `int`
being nonzero and between two and four extra pointer comparisons on any
call to `bin_op`. Mind you, that's going to happen for '''every''' binary
arithmetic operation where the parents aren't identical, which is
definitely a hefty price if you live in a world where you don't multiply
`Integer * int` very often. I think we could eliminate the C `int` test,
if we were crafty.
(Side note: I got a ~10% performance regression on the `a*y` line above
between 4.2 and 4.3 -- do either of you have both installed to try it
out?)
Robert, what are your thoughts? I know that "Special cases aren't special
enough to break the rules," and in general I don't like special-casing
things. Plus, it's potentially less significant with Py3 on the horizon.
But if it's really going to be a showstopper for Fredrik, we should at
least think about it more. It's not like there are very many types where
the operations are noticeably faster than a few lookups, so it's not
opening a loophole that could be abused very much, I think.
--
Ticket URL: <http://trac.sagemath.org/sage_trac/ticket/6199#comment:5>
Sage <http://www.sagemath.org>
Sage: Creating a Viable Open Source Alternative to Magma, Maple, Mathematica,
and MATLAB
--
You received this message because you are subscribed to the Google Groups
"sage-trac" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to
[email protected].
For more options, visit this group at
http://groups.google.com/group/sage-trac?hl=en.