#20681: Cythonize the special methods in the categories that handle coercion in
arithmetic
-------------------------------------+-------------------------------------
       Reporter:  nthiery            |        Owner:
           Type:  enhancement        |       Status:  needs_work
       Priority:  major              |    Milestone:  sage-7.3
      Component:  categories         |   Resolution:
       Keywords:  performance        |    Merged in:
        Authors:  Nicolas M. Thiéry  |    Reviewers:  Travis Scrimshaw
Report Upstream:  N/A                |  Work issues:
         Branch:                     |       Commit:
  u/nthiery/cythonize_arithmetic_special_methods_in_the_categories|  
0698cf523f274626f4f1eabcd0dc224c623e3f47
   Dependencies:                     |     Stopgaps:
-------------------------------------+-------------------------------------
Description changed by nthiery:

Old description:

> Recall that the Sage categories provide implementations for Python's
> special methods for the basic arithmetic operations (e.g. `__mul__`,
> `__add__`), that dispatch to the coercion model or call Sage's special
> methods (e.g. `_mul_`, `_add_`) for the internal
> operations. Furthermore, the default implementation for `_mul_` is to
> delegate the work to the `product` method of the parent.
>
> Before this ticket, those indirections induced an overhead of roughly
> 100ns for `*`, 480ns for `__mul__` and 340ns for `_mul_`:
> {{{
>     sage: S = Semigroups().example()
>     sage: x = S.an_element()
>     sage: x*x
>     42
>     sage: sage: %timeit [x*x for i in range(1000)]
>     1000 loops, best of 3: 1.21 ms per loop
>     sage: sage: %timeit [x.__mul__(x) for i in range(1000)]
>     1000 loops, best of 3: 1.12 ms per loop
>     sage: sage: %timeit [x._mul_(x) for i in range(1000)]
>     1000 loops, best of 3: 643 µs per loop
>     sage: %timeit [S.product(x,x) for i in range(1000)]
>     1000 loops, best of 3: 300 µs per loop
> }}}
>
> After this ticket, which Cythonize those methods, the overhead is
> reduced to roughly 20ns for `*`, 100ns for `__mul__` and 160ms for
> `_mul_`, which is a speedup by a factor of 3 on the accumulated
> overhead, and an overall speedup by a factor of 2 for this particular
> parent:
> {{{
>     sage: %timeit [x*x for i in range(1000)]
>     1000 loops, best of 3: 596 µs per loop
>     sage: %timeit [x.__mul__(x) for i in range(1000)]
>     1000 loops, best of 3: 572 µs per loop
>     sage: %timeit [x._mul_(x) for i in range(1000)]
>     1000 loops, best of 3: 467 µs per loop
>     sage: %timeit [S.product(x,x) for i in range(1000)]
>     1000 loops, best of 3: 300 µs per loop
> }}}
>
> Notes
> =====
>
> The average timings vary by about 5-10% from one run to the other.
>
> In this parent, the product is trivial, and implemented as a plain
> Python `product` method in the parent. For a parent with a non trivial
> Python product, the overhead of the indirections is now of 20%, which
> is getting closer to acceptable::
> {{{
>     sage: S = Semigroups().Finite().example()
>     sage: x = S.an_element()
>     sage: %timeit [S.product(x,x) for i in range(1000)]
>     100 loops, best of 3: 4.39 ms per loop
>     sage: %timeit [x*x for i in range(1000)]
>     100 loops, best of 3: 5.28 ms per loop
> }}}
>
> For comparison, here is the cost of the product of two (Sage)
> integers::
> {{{
>     sage: x = 3
>     sage: %timeit [x*x for i in range(1000)]
>     1000 loops, best of 3: 162 µs per loop
>     sage: x = int(3)
>     sage: %timeit [x*x for i in range(1000)]
>     10000 loops, best of 3: 63.9 µs per loop
> }}}
>
> So the 120ns overhead when using coercion methods `__mul__` provided
> by categories (rather than from a Cython super class) is of the same
> order of magnitude as the cost of a trivial multiplication.

New description:

 Recall that the Sage categories provide implementations for Python's
 special methods for the basic arithmetic operations (e.g. `__mul__`,
 `__add__`), that dispatch to the coercion model or call Sage's special
 methods (e.g. `_mul_`, `_add_`) for the internal
 operations. Furthermore, the default implementation for `_mul_` is to
 delegate the work to the `product` method of the parent.

 Before this ticket, those indirections induced an overhead of roughly
 100ns for `*`, 480ns for `__mul__` and 340ns for `_mul_`:
 {{{
     sage: S = Semigroups().example()
     sage: x = S.an_element()
     sage: x*x
     42
     sage: sage: %timeit [x*x for i in range(1000)]
     1000 loops, best of 3: 1.21 ms per loop
     sage: sage: %timeit [x.__mul__(x) for i in range(1000)]
     1000 loops, best of 3: 1.12 ms per loop
     sage: sage: %timeit [x._mul_(x) for i in range(1000)]
     1000 loops, best of 3: 643 µs per loop
     sage: %timeit [S.product(x,x) for i in range(1000)]
     1000 loops, best of 3: 300 µs per loop
 }}}

 After this ticket, which Cythonize those methods, the overhead is
 reduced to roughly 20ns for `*`, 100ns for `__mul__` and 160ms for
 `_mul_`, which is a speedup by a factor of 3 on the accumulated
 overhead, and an overall speedup by a factor of 2 for this particular
 parent:
 {{{
     sage: %timeit [x*x for i in range(1000)]
     1000 loops, best of 3: 596 µs per loop
     sage: %timeit [x.__mul__(x) for i in range(1000)]
     1000 loops, best of 3: 572 µs per loop
     sage: %timeit [x._mul_(x) for i in range(1000)]
     1000 loops, best of 3: 467 µs per loop
     sage: %timeit [S.product(x,x) for i in range(1000)]
     1000 loops, best of 3: 300 µs per loop
 }}}

 *Notes*

 The average timings vary by about 5-10% from one run to the other.

 In this parent, the product is trivial, and implemented as a plain
 Python `product` method in the parent. For a parent with a non trivial
 Python product, the overhead of the indirections is now of 20%, which
 is getting closer to acceptable::
 {{{
     sage: S = Semigroups().Finite().example()
     sage: x = S.an_element()
     sage: %timeit [S.product(x,x) for i in range(1000)]
     100 loops, best of 3: 4.39 ms per loop
     sage: %timeit [x*x for i in range(1000)]
     100 loops, best of 3: 5.28 ms per loop
 }}}

 For comparison, here is the cost of the product of two (Sage)
 integers::
 {{{
     sage: x = 3
     sage: %timeit [x*x for i in range(1000)]
     1000 loops, best of 3: 162 µs per loop
     sage: x = int(3)
     sage: %timeit [x*x for i in range(1000)]
     10000 loops, best of 3: 63.9 µs per loop
 }}}

 So the 120ns overhead when using coercion methods `__mul__` provided
 by categories (rather than from a Cython super class) is of the same
 order of magnitude as the cost of a trivial multiplication.

--

--
Ticket URL: <http://trac.sagemath.org/ticket/20681#comment:8>
Sage <http://www.sagemath.org>
Sage: Creating a Viable Open Source Alternative to Magma, Maple, Mathematica, 
and MATLAB

-- 
You received this message because you are subscribed to the Google Groups 
"sage-trac" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at https://groups.google.com/group/sage-trac.
For more options, visit https://groups.google.com/d/optout.

Reply via email to