What direction would you suggest for the future of fast expression evaluation? Would it be making the assumptions system faster? Or choosing carefully between 'easy assumptions' and 'hard assumptions', guided by time complexity analysis? Or not using the assumptions system at all, and replacing it with faster (but more limited) decision procedures such as pattern matching or type theory?
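To make the 'easy' versus 'hard' distinction concrete, here is a minimal sketch in plain Python. Everything in it (`Num`, `Sqrt`, `Add`, `is_positive_easy`, `allow_hard`) is a hypothetical toy and not SymPy's API. It only illustrates a three-valued query that tries cheap structural rules first and falls back to numeric evaluation only when the caller asks for it.

```python
import math
from dataclasses import dataclass
from typing import Optional, Tuple, Union


# Hypothetical toy expression types; these are NOT SymPy classes.
@dataclass(frozen=True)
class Num:
    value: float


@dataclass(frozen=True)
class Sqrt:
    arg: "Expr"


@dataclass(frozen=True)
class Add:
    args: Tuple["Expr", ...]


Expr = Union[Num, Sqrt, Add]


def is_positive_easy(e: Expr) -> Optional[bool]:
    """Cheap structural rules only; returns None when undecided."""
    if isinstance(e, Num):
        return e.value > 0
    if isinstance(e, Sqrt):
        # The square root of a positive number is positive; otherwise give up.
        return True if is_positive_easy(e.arg) else None
    if isinstance(e, Add):
        # A sum of positive terms is positive; anything else is undecided.
        return True if all(is_positive_easy(a) for a in e.args) else None
    return None


def evaluate(e: Expr) -> float:
    """Numeric evaluation: the 'hard' path, whose cost grows with the expression."""
    if isinstance(e, Num):
        return e.value
    if isinstance(e, Sqrt):
        return math.sqrt(evaluate(e.arg))
    return sum(evaluate(a) for a in e.args)


def is_positive(e: Expr, allow_hard: bool = False) -> Optional[bool]:
    """Try the easy rules first; evaluate numerically only on request."""
    answer = is_positive_easy(e)
    if answer is None and allow_hard:
        # Floating-point comparison; unreliable for values very close to zero.
        return evaluate(e) > 0
    return answer


easy = Add((Num(1.0), Sqrt(Num(2.0))))   # 1 + sqrt(2)
hard = Add((Sqrt(Num(2.0)), Num(-1.0)))  # sqrt(2) - 1
print(is_positive(easy), is_positive(hard), is_positive(hard, allow_hard=True))
```

In this toy, constructors could call `is_positive` with the default `allow_hard=False` and accept `None` as an answer, while a user who really wants a definite result opts in to the expensive path explicitly.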
On Sunday, April 17, 2022 at 5:45:12 PM UTC+3 Oscar wrote:

> On Sun, 17 Apr 2022 at 15:02, David Bailey wrote:
> >
> > On 17/04/2022 13:00, Oscar Benjamin wrote:
> > > On Sun, 17 Apr 2022 at 11:47, David Bailey wrote:
> > >>
> > >> I am curious as to how much of the difference in speed between SymPy and
> > >> SymEngine you think is attributable to the lack of the optimal design of
> > >> the Python code, and how much do you think is attributable to the choice
> > >> of computer language. Clearly C++ is a much lower level language than
> > >> Python, and would presumably be intrinsically much faster, but result in
> > >> many hard to fix memory corruption bugs.
> > >
> > > It's hard to disentangle these things. Both SymPy and SymEngine will
> > > do a lot of symbolic processing behind the scenes to produce the
> > > output that users expect to see. In general SymPy will do a lot more
> > > processing than SymEngine which means that expressions will not
> > > evaluate in the same way but also means that a comparison of the two
> > > for speed is not easy to interpret as being about e.g. C++ vs Python
> > > or any other particular optimisation.
> > >
> > > For example one of the things that is often slow in SymPy when you
> > > have large expressions is assumptions queries. Every time you create
> > > an expression like an Add or a Mul or exp etc there is a lot of
> > > processing that goes on to determine if the expression can simplify
> > > and this often involves checking "assumptions" using the core
> > > assumptions system e.g.:
> > >
> > > >>> n = symbols('n', integer=True)
> > > >>> sin(n*pi)
> > > 0
> > >
> > > This kind of thing often dominates the runtime in SymPy when working
> > > with large expressions. In SymEngine there are no assumptions and so
> > > no assumptions checking is done at all:
> > >
> > > >>> from symengine import symbols, sin, pi
> > > >>> n = symbols('n', integer=True)
> > > >>> sin(n*pi)
> > > sin(n*pi)
> > > >>> n.is_integer
> > > False
> > >
> > > Having a simpler evaluation scheme would make SymPy faster while still
> > > working in Python. There are many more examples of this where SymPy
> > > just hasn't been clearly designed with performance always in mind.
> > > Contributors are often unaware of the performance implications of the
> > > changes that they make and it's very easy for a seemingly innocent fix
> > > in one place to result in significant slowdowns elsewhere.
> > >
> >
> > Maybe I'm missing something, but if assumptions are so costly, couldn't
> > every expression contain a flag contains_assumptions to say if it
> > contains assumptions. Then when a larger expression was created it would
> > be simple to compute the contains_assumptions for the larger expression.
> >
> > E.g. in your example, the expression for n would have
> > contains_assumptions set to 1, and this would propagate through any
> > expression containing n.
> >
> > I think this scheme would work because expressions are immutable.
> >
> > Couldn't such a flag be used to speed things up by locally turning off
> > the assumption checking?
>
> The core assumptions system isn't just about assumptions that are
> defined on symbols: every expression has "assumptions" e.g.:
>
> In [1]: (1 + sqrt(2)).is_positive
> Out[1]: True
>
> You can read more about it here:
> https://docs.sympy.org/dev/guides/assumptions.html
>
> Typically what can be slow when manipulating large expressions is
> actually expressions that don't involve symbols at all. For example to
> answer the query (1 + sqrt(2)).is_positive the expression is
> numerically evaluated with evalf which can be very expensive for large
> expressions. This can also make operations with some expressions very
> slow e.g. RootOf, Integral, etc. There is a tension between some
> users/contributors wanting all assumptions queries to give a definite
> answer and the fact that the "assumptions system" is invoked
> (repeatedly) pretty much every time any new expression object is
> created.
>
> --
> Oscar
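
For reference, the `contains_assumptions` flag proposed in the quoted thread could be sketched roughly as follows. This is a toy in plain Python under stated assumptions: `Node`, `head` and `declared` are hypothetical names, not SymPy or SymEngine classes. The flag is computed once at construction, which is safe only because the nodes are immutable.

```python
from dataclasses import dataclass, field
from typing import Tuple


@dataclass(frozen=True)
class Node:
    """Hypothetical immutable expression node (not a SymPy class)."""
    head: str
    args: Tuple["Node", ...] = ()
    declared: bool = False  # True only for a symbol created with assumptions
    contains_assumptions: bool = field(init=False)

    def __post_init__(self) -> None:
        # Propagate the flag upwards from the children at construction time.
        flag = self.declared or any(a.contains_assumptions for a in self.args)
        # The dataclass is frozen, so bypass its __setattr__ for this one write.
        object.__setattr__(self, "contains_assumptions", flag)


n = Node("n", declared=True)                      # like symbols('n', integer=True)
sin_n_pi = Node("sin", (Node("Mul", (n, Node("pi"))),))
numeric = Node("Add", (Node("1"), Node("sqrt", (Node("2"),))))  # 1 + sqrt(2)

print(sin_n_pi.contains_assumptions)  # True: the flag propagates from n
print(numeric.contains_assumptions)   # False: no declared symbols inside
```

The second result is where the reply quoted above bites: the flag is False for 1 + sqrt(2), yet a query such as is_positive on that expression is still meaningful and may still need numeric evaluation, so the flag alone would not tell the core when checking can be skipped.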
