Comment #48 on issue 2607 by asmeurer: as_numer_denom() is too slow
http://code.google.com/p/sympy/issues/detail?id=2607
Here are timings for the many denoms few numers idea from comment 19. All
timings (obviously) done in the and branch:
Cache on:
In [1]: numers, denoms = zip(*((Symbol('n%d'%i),Symbol('d%d'%i)) for i in
xrange(100)))
In [2]: a = Add(*(n/d for n in numers[:10] for d in denoms))
In [3]: %timeit a.as_numer_denom1()
^C
KeyboardInterrupt: # Took too long
In [4]: %timeit a.as_numer_denom2()
1 loops, best of 3: 1.98 s per loop
In [5]: %timeit a.as_numer_denom3()
1 loops, best of 3: 163 ms per loop
In [6]: %timeit a.as_numer_denom4()
1 loops, best of 3: 120 ms per loop
In [7]: %timeit a.as_numer_denom4b()
10 loops, best of 3: 170 ms per loop
In [8]: %timeit a.as_numer_denom5()
1 loops, best of 3: 82.6 ms per loop
In [9]: %timeit a.as_numer_denom6()
1 loops, best of 3: 234 ms per loop
In [10]: %timeit a.as_numer_denom7()
1 loops, best of 3: 82.1 ms per loop
In [11]: %timeit a.as_numer_denom8()
1 loops, best of 3: 100 ms per loop
In [12]: %timeit together(a)
1 loops, best of 3: 4.73 s per loop
Cache off:
In [1]: numers, denoms = zip(*((Symbol('n%d'%i),Symbol('d%d'%i)) for i in
xrange(100)))
In [2]: a = Add(*(n/d for n in numers[:10] for d in denoms))
In [3]: %timeit a.as_numer_denom2()
1 loops, best of 3: 21.5 s per loop
In [4]: %timeit a.as_numer_denom3()
1 loops, best of 3: 16 s per loop
In [5]: %timeit a.as_numer_denom4()
1 loops, best of 3: 28.1 s per loop
In [6]: %timeit a.as_numer_denom4b()
1 loops, best of 3: 22.9 s per loop
In [7]: %timeit a.as_numer_denom5()
1 loops, best of 3: 1.46 s per loop
In [8]: %timeit a.as_numer_denom6()
1 loops, best of 3: 353 ms per loop
In [9]: %timeit a.as_numer_denom7()
1 loops, best of 3: 461 ms per loop
In [10]: %timeit a.as_numer_denom8()
1 loops, best of 3: 3.87 s per loop
In [11]: %timeit together(a)
1 loops, best of 3: 14 s per loop
What can we surmise from this? First, that all implementations are hitting
some cache intensive parts of the code---some more than others. Second,
your variants are better than any of mine or together in any case.
There's more to the story, though. as_numer_denom7(), which appears to be
one of the fastest methods, returns a nested expression, which must be
denested in many cases to do anything useful with it (e.g., to make a Poly
out of it). This takes more time, and I think it's safe to assume that it
would make it slower than most other contending methods if we tacked on
that time. However, it's not as nested as the output of
as_numer_denom3(). Just how nested is it? The depth of the output of
as_numer_denom3() is equal to the number of terms. as_numer_denom7() seems
to be more like the log of the number of terms, which is actually
reasonable (for 100 terms, the depth is only 7).
So how about the dual test, many numers and few denoms? This will rely on
a good common denominator optimization.
Cache on:
In [1]: numers, denoms = zip(*((Symbol('n%d'%i),Symbol('d%d'%i)) for i in
xrange(100)))
In [2]: a = Add(*(n/d for n in numers for d in denoms[:10]))
In [3]: %timeit a.as_numer_denom1()
^C
KeyboardInterrupt:
In [4]: %timeit a.as_numer_denom2()
1 loops, best of 3: 992 ms per loop
In [5]: %timeit a.as_numer_denom3()
1 loops, best of 3: 93.1 ms per loop
In [6]: %timeit a.as_numer_denom4()
1 loops, best of 3: 85.7 ms per loop
In [7]: %timeit a.as_numer_denom4b()
10 loops, best of 3: 104 ms per loop
In [8]: %timeit a.as_numer_denom5()
10 loops, best of 3: 55.8 ms per loop
In [9]: %timeit a.as_numer_denom6()
10 loops, best of 3: 56.5 ms per loop
In [10]: %timeit a.as_numer_denom7()
10 loops, best of 3: 55.8 ms per loop
In [11]: %timeit a.as_numer_denom8()
1 loops, best of 3: 53 ms per loop
In [12]: %timeit together(a)
1 loops, best of 3: 398 ms per loop
Cache off:
In [2]: a = Add(*(n/d for n in numers for d in denoms[:10]))
In [3]: %timeit a.as_numer_denom2()
1 loops, best of 3: 11.1 s per loop
In [4]: %timeit a.as_numer_denom3()
1 loops, best of 3: 1.68 s per loop
In [5]: %timeit a.as_numer_denom4()
1 loops, best of 3: 1.59 s per loop
In [6]: %timeit a.as_numer_denom4b()
1 loops, best of 3: 1.58 s per loop
In [7]: %timeit a.as_numer_denom5()
1 loops, best of 3: 189 ms per loop
In [8]: %timeit a.as_numer_denom6()
10 loops, best of 3: 248 ms per loop
In [9]: %timeit a.as_numer_denom7()
1 loops, best of 3: 192 ms per loop
In [10]: %timeit a.as_numer_denom8()
1 loops, best of 3: 197 ms per loop
In [11]: %timeit together(a)
1 loops, best of 3: 1.31 s per loop
I'd say this doesn't tell us anything new.
Now, we should make as_numer_denom9, 10, ... based on 5-8 except it does
structural gcd like together() does (see comment 29), so we can see if this
makes things faster or slower.
(sorry for the long comments of timings, but I think it's important to
document this)
--
You received this message because you are subscribed to the Google Groups
"sympy-issues" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to
[email protected].
For more options, visit this group at
http://groups.google.com/group/sympy-issues?hl=en.