On Wed, 15 Sept 2021 at 13:41, Oscar Benjamin <[email protected]> wrote:
> On Tue, 14 Sept 2021 at 23:12, [email protected]
> <[email protected]> wrote:
>>
>> Hello,
>> let's say I'd like to numerically evaluate a single sympy function over
>> an array using sympy as the module. Curiously, passing in regular
>> Python's float numbers makes the evaluation much faster than passing in
>> SymPy's Float instances. I tried several sympy functions; they tend to
>> follow this trend.
>
> The 3 millisecond timing difference that you are asking about here is
> dwarfed by the 1 second that it actually takes to compute this result
> the first time. Most likely the time differences you see are just down
> to how efficient the cache lookups are for different types.

If you want to see what is taking the time, use a profiler. From
isympy/IPython you can use %prun, or if you want to run separate processes
you can use python -m cProfile, but that also times how long it takes to
import sympy, so it is only useful for things that take at least several
seconds. In the first run the time is taken by sin.eval, and most of that
by the extract_multiplicatively function.
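For reference, the setup being profiled below looks roughly like this. The
original post's code is not quoted in full, so the values, sizes and the
elementwise wrapper here are assumptions; only the names f, domain and
domain_sympy are taken from the %prun commands:

```python
# Hypothetical reconstruction of the benchmark setup (the original
# post's code is not shown; sizes, values and the wrapper are guesses).
from sympy import Float, lambdify, sin, symbols

x = symbols('x')

# lambdify with "sympy" as the module, so the generated function calls
# sympy's own sin rather than math.sin or numpy.sin
g = lambdify(x, sin(x), "sympy")

def f(values):
    # apply the lambdified function elementwise, matching the 1000-ish
    # _lambdifygenerated calls per f(domain) in the profiles below
    return [g(v) for v in values]

# plain Python floats vs the same values as sympy Float instances
domain = [i / 1000 for i in range(1, 1000)]
domain_sympy = [Float(v) for v in domain]
```

With this in place, f(domain) and f(domain_sympy) are what the %prun
sessions below are timing.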
In [2]: %prun -s cumulative f(domain)
         2069604 function calls (2047615 primitive calls) in 1.585 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    1.585    1.585 {built-in method builtins.exec}
        1    0.001    0.001    1.585    1.585 <string>:1(<module>)
     1000    0.002    0.000    1.584    0.002 <lambdifygenerated-1>:1(_lambdifygenerated)
16989/1000   0.022    0.000    1.582    0.002 cache.py:69(wrapper)
     1000    0.014    0.000    1.579    0.002 function.py:450(__new__)
     1000    0.013    0.000    1.392    0.001 function.py:270(__new__)
     1000    0.012    0.000    1.270    0.001 trigonometric.py:266(eval)
     2997    0.029    0.000    0.988    0.000 expr.py:2183(extract_multiplicatively)
     7993    0.020    0.000    0.835    0.000 assumptions.py:460(getit)
     8993    0.336    0.000    0.722    0.000 facts.py:499(deduce_all_facts)
6993/2997    0.007    0.000    0.529    0.000 decorators.py:88(__sympifyit_wrapper)
     2997    0.010    0.000    0.525    0.000 numbers.py:1315(__truediv__)
      999    0.005    0.000    0.511    0.001 expr.py:2453(could_extract_minus_sign)
      999    0.001    0.000    0.505    0.001 expr.py:1662(as_coefficient)
      999    0.001    0.000    0.480    0.000 numbers.py:759(__truediv__)
      999    0.001    0.000    0.478    0.000 decorators.py:254(_func)
      999    0.001    0.000    0.476    0.000 decorators.py:129(binary_op_wrapper)
      999    0.003    0.000    0.474    0.000 expr.py:260(__truediv__)
     4996    0.015    0.000    0.459    0.000 assumptions.py:472(_ask)
      999    0.015    0.000    0.457    0.000 operations.py:46(__new__)
      999    0.037    0.000    0.408    0.000 mul.py:178(flatten)
     3997    0.005    0.000    0.357    0.000 assumptions.py:444(copy)
     3997    0.014    0.000    0.352    0.000 assumptions.py:432(__init__)
   198823    0.074    0.000    0.254    0.000 {built-in method builtins.all}
   570492    0.158    0.000    0.209    0.000 facts.py:533(<genexpr>)
     1999    0.032    0.000    0.126    0.000 sets.py:1774(__new__)
   183867    0.083    0.000    0.083    0.000 facts.py:482(_tell)
      999    0.007    0.000    0.078    0.000 evalf.py:1425(evalf)
    15987    0.023    0.000    0.077    0.000 sympify.py:92(sympify)
     1000    0.003    0.000    0.066    0.000 function.py:214(nargs)
     7996    0.026    0.000    0.065    0.000 compatibility.py:501(ordered)
     6994    0.015    0.000    0.063    0.000 numbers.py:1197(_new)
    29977    0.020    0.000    0.057    0.000 <frozen importlib._bootstrap>:1009(_handle_fromlist)
   391655    0.054    0.000    0.054    0.000 {method 'get' of 'dict' objects}
 1998/999    0.007    0.000    0.053    0.000 evalf.py:1332(evalf)
     1998    0.053    0.000    0.053    0.000 mul.py:449(_gather)
      999    0.005    0.000    0.046    0.000 evalf.py:781(evalf_trig)
      ...

Now a second run in the same process just shows 1000 cache lookups:

In [3]: %prun -s cumulative f(domain)
         2003 function calls in 0.002 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.002    0.002 {built-in method builtins.exec}
        1    0.001    0.001    0.002    0.002 <string>:1(<module>)
     1000    0.001    0.000    0.001    0.000 <lambdifygenerated-1>:1(_lambdifygenerated)
     1000    0.001    0.000    0.001    0.000 cache.py:69(wrapper)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

This is the second run when the inputs are Expr:

In [5]: %prun -s cumulative f(domain_sympy)
         5003 function calls in 0.014 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.014    0.014 {built-in method builtins.exec}
        1    0.007    0.007    0.014    0.014 <string>:1(<module>)
     1000    0.002    0.000    0.008    0.000 <lambdifygenerated-1>:1(_lambdifygenerated)
     1000    0.002    0.000    0.006    0.000 cache.py:69(wrapper)
     1000    0.001    0.000    0.004    0.000 numbers.py:1480(__hash__)
     1000    0.001    0.000    0.002    0.000 numbers.py:806(__hash__)
     1000    0.001    0.000    0.001    0.000 expr.py:126(__hash__)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

The cache lookups here are slower because the pure Python Expr.__hash__
methods are slower than numpy's float64.__hash__, which is implemented in
C. Either way, this shows that the differences are all just down to caching
and are not representative of the actual time it would take to compute
these results once. The bulk of the time in the first run is ultimately
consumed by deduce_all_facts, i.e. the old assumptions.
Note that the old assumptions are themselves cached on each Basic instance,
separately from the main cache, which can also affect timing results if you
don't use a separate process for each timing:

In [1]: e = sin(1)

In [2]: e._assumptions
Out[2]: {}

In [3]: e.is_real
Out[3]: True

In [4]: e._assumptions
Out[4]:
{'real': True,
 'positive': True,
 'finite': True,
 'infinite': False,
 'extended_positive': True,
 'extended_real': True,
 'commutative': True,
 'imaginary': False,
 'hermitian': True,
 'complex': True,
 'extended_nonnegative': True,
 'nonpositive': False,
 'negative': False,
 'extended_nonpositive': False,
 'extended_nonzero': True,
 'extended_negative': False,
 'zero': False,
 'nonnegative': True,
 'nonzero': True}

The deduce_all_facts function is the one that applies all of the
implications relating these different assumption predicates, so that as
soon as one predicate (e.g. positive=True) is known, all of the others can
be stored in the _assumptions dict. Although this makes things faster on a
second assumptions query, it in fact consumes the bulk of the time for
computing many things in SymPy. Assumptions queries that take place during
automatic evaluation are often a cause of slowness in SymPy.

The bottom of the profile report shows evalf being called 2000 times, i.e.
twice per evaluation of the sin function. The evalf calls are part of the
assumptions query and are reasonably fast for a simple sin(float), but they
can be very expensive for large expressions.

--
Oscar

--
You received this message because you are subscribed to the Google Groups "sympy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion on the web visit https://groups.google.com/d/msgid/sympy/CAHVvXxRmA9AYL3TnAJeM_crc%2BKc9kb3ADtLR%2BptZEQKYPbKspw%40mail.gmail.com.
