On Wed, 15 Sept 2021 at 13:41, Oscar Benjamin <[email protected]>
wrote:

> On Tue, 14 Sept 2021 at 23:12, [email protected] <
> [email protected]> wrote:
>
>> Hello,
>> let's say I'd like to numerically evaluate a single sympy function over
>> an array using sympy as the module. Curiously, passing in regular Python's
>> float numbers makes the evaluation much faster than passing in SymPy's
>> Float instances. I tried several sympy functions; they tend to follow this
>> trend.
>>
>
> The 3 millisecond timing difference that you are asking about here is
> dwarfed by the actual 1 second time that it really takes to compute this
> result the first time. Most likely the time differences you see are just to
> do with exactly how efficient the cache lookups are for different types.
>

If you want to see what is taking the time, use a profiler. From
isympy/IPython you can use %prun, or to profile a separate process you can
use python -m cProfile, but that will also time how long it takes to
import sympy, so it's only useful for things that take at least several
seconds.
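For a standalone script the same report can be produced with cProfile and
pstats directly. This is only a sketch: the exact definitions of f and
domain are not shown here, so the lambdify call and the domain values below
are assumptions.

```python
import cProfile
import pstats

from sympy import lambdify, sin, symbols

# Assumed reconstruction of the setup being profiled: a lambdified sin
# that evaluates with SymPy itself, applied to 1000 plain Python floats.
x = symbols('x')
f = lambdify(x, sin(x), modules='sympy')
domain = [i / 1000 for i in range(1000)]

profiler = cProfile.Profile()
profiler.enable()
results = [f(v) for v in domain]
profiler.disable()

# Equivalent of %prun -s cumulative: sort by cumulative time and show
# the top entries.
pstats.Stats(profiler).sort_stats('cumulative').print_stats(10)
```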

In the first run the time is taken by sin.eval and most of that by the
extract_multiplicatively function.

In [2]: %prun -s cumulative f(domain)

         2069604 function calls (2047615 primitive calls) in 1.585 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    1.585    1.585 {built-in method builtins.exec}
        1    0.001    0.001    1.585    1.585 <string>:1(<module>)
     1000    0.002    0.000    1.584    0.002 <lambdifygenerated-1>:1(_lambdifygenerated)
16989/1000    0.022    0.000    1.582    0.002 cache.py:69(wrapper)
     1000    0.014    0.000    1.579    0.002 function.py:450(__new__)
     1000    0.013    0.000    1.392    0.001 function.py:270(__new__)
     1000    0.012    0.000    1.270    0.001 trigonometric.py:266(eval)
     2997    0.029    0.000    0.988    0.000 expr.py:2183(extract_multiplicatively)
     7993    0.020    0.000    0.835    0.000 assumptions.py:460(getit)
     8993    0.336    0.000    0.722    0.000 facts.py:499(deduce_all_facts)
6993/2997    0.007    0.000    0.529    0.000 decorators.py:88(__sympifyit_wrapper)
     2997    0.010    0.000    0.525    0.000 numbers.py:1315(__truediv__)
      999    0.005    0.000    0.511    0.001 expr.py:2453(could_extract_minus_sign)
      999    0.001    0.000    0.505    0.001 expr.py:1662(as_coefficient)
      999    0.001    0.000    0.480    0.000 numbers.py:759(__truediv__)
      999    0.001    0.000    0.478    0.000 decorators.py:254(_func)
      999    0.001    0.000    0.476    0.000 decorators.py:129(binary_op_wrapper)
      999    0.003    0.000    0.474    0.000 expr.py:260(__truediv__)
     4996    0.015    0.000    0.459    0.000 assumptions.py:472(_ask)
      999    0.015    0.000    0.457    0.000 operations.py:46(__new__)
      999    0.037    0.000    0.408    0.000 mul.py:178(flatten)
     3997    0.005    0.000    0.357    0.000 assumptions.py:444(copy)
     3997    0.014    0.000    0.352    0.000 assumptions.py:432(__init__)
   198823    0.074    0.000    0.254    0.000 {built-in method builtins.all}
   570492    0.158    0.000    0.209    0.000 facts.py:533(<genexpr>)
     1999    0.032    0.000    0.126    0.000 sets.py:1774(__new__)
   183867    0.083    0.000    0.083    0.000 facts.py:482(_tell)
      999    0.007    0.000    0.078    0.000 evalf.py:1425(evalf)
    15987    0.023    0.000    0.077    0.000 sympify.py:92(sympify)
     1000    0.003    0.000    0.066    0.000 function.py:214(nargs)
     7996    0.026    0.000    0.065    0.000 compatibility.py:501(ordered)
     6994    0.015    0.000    0.063    0.000 numbers.py:1197(_new)
    29977    0.020    0.000    0.057    0.000 <frozen importlib._bootstrap>:1009(_handle_fromlist)
   391655    0.054    0.000    0.054    0.000 {method 'get' of 'dict' objects}
 1998/999    0.007    0.000    0.053    0.000 evalf.py:1332(evalf)
     1998    0.053    0.000    0.053    0.000 mul.py:449(_gather)
      999    0.005    0.000    0.046    0.000 evalf.py:781(evalf_trig)
   ...

Now a second run in the same process just shows 1000 cache lookups:

In [3]: %prun -s cumulative f(domain)

         2003 function calls in 0.002 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.002    0.002 {built-in method builtins.exec}
        1    0.001    0.001    0.002    0.002 <string>:1(<module>)
     1000    0.001    0.000    0.001    0.000 <lambdifygenerated-1>:1(_lambdifygenerated)
     1000    0.001    0.000    0.001    0.000 cache.py:69(wrapper)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
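The cache hits can be seen directly: repeating the same call returns the
very same object from SymPy's cacheit cache. A small sketch, assuming a
warm cache in the same process:

```python
from sympy import Float, sin

# sin is wrapped by SymPy's cache, so a second call with an equal
# argument returns the cached object instead of running eval() again.
a = sin(Float('0.7'))
b = sin(Float('0.7'))
assert a is b
```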

This is the second run when the inputs are Expr:

In [5]: %prun -s cumulative f(domain_sympy)

         5003 function calls in 0.014 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.014    0.014 {built-in method builtins.exec}
        1    0.007    0.007    0.014    0.014 <string>:1(<module>)
     1000    0.002    0.000    0.008    0.000 <lambdifygenerated-1>:1(_lambdifygenerated)
     1000    0.002    0.000    0.006    0.000 cache.py:69(wrapper)
     1000    0.001    0.000    0.004    0.000 numbers.py:1480(__hash__)
     1000    0.001    0.000    0.002    0.000 numbers.py:806(__hash__)
     1000    0.001    0.000    0.001    0.000 expr.py:126(__hash__)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

The cache lookups here are slower because the pure Python Expr.__hash__
methods are slower than numpy's float64.__hash__, which is implemented
in C.
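The hashing difference is easy to measure in isolation. A rough sketch
comparing a plain Python float (whose hash is also computed in C) with a
SymPy Float; absolute numbers will vary by machine:

```python
import timeit

from sympy import Float

py_val = 0.5           # plain Python float: hash() is implemented in C
sym_val = Float(0.5)   # SymPy Float: __hash__ goes through pure Python

t_py = timeit.timeit(lambda: hash(py_val), number=100_000)
t_sym = timeit.timeit(lambda: hash(sym_val), number=100_000)

print(f"float hash: {t_py:.4f}s")
print(f"Float hash: {t_sym:.4f}s")
```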

Either way this shows that the differences are all just to do with caching
and are not representative of the actual time that it would take to compute
these results once. The bulk of the time in the first run is ultimately
consumed by deduce_all_facts, i.e. the old assumptions. Note that the old
assumptions are themselves cached on each Basic instance separately from
the main cache, which can also affect timing results if you don't use a
separate process for timings:

In [1]: e = sin(1)

In [2]: e._assumptions
Out[2]: {}

In [3]: e.is_real
Out[3]: True

In [4]: e._assumptions
Out[4]:
{'real': True,
 'positive': True,
 'finite': True,
 'infinite': False,
 'extended_positive': True,
 'extended_real': True,
 'commutative': True,
 'imaginary': False,
 'hermitian': True,
 'complex': True,
 'extended_nonnegative': True,
 'nonpositive': False,
 'negative': False,
 'extended_nonpositive': False,
 'extended_nonzero': True,
 'extended_negative': False,
 'zero': False,
 'nonnegative': True,
 'nonzero': True}

The deduce_all_facts function is the one that applies all of the
implications relating these different assumption predicates, so that as
soon as one predicate (e.g. positive=True) is known all of the others can
be stored in the _assumptions dict. Although this makes things faster on a
second assumptions query, it does in fact consume the bulk of the time for
computing many things in SymPy.
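The implication machinery is easy to see with a symbol declared with a
single assumption: one known predicate lets deduce_all_facts fill in all
of the related ones.

```python
from sympy import Symbol

# Only positive=True is declared explicitly...
x = Symbol('x', positive=True)

# ...but the implications have already been propagated, so the related
# predicates come back from the _assumptions cache without more work.
assert x.is_real is True
assert x.is_finite is True
assert x.is_zero is False
assert x.is_nonnegative is True
```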

Assumptions queries that take place during automatic evaluation are often a
cause of slowness in SymPy. The bottom of the profile report shows evalf
being called about 2000 times, i.e. twice per evaluation of the sin
function. The evalf calls are part of the assumptions query and are
reasonably fast for a simple sin(float) but can be very expensive for large
expressions.


--
Oscar
