On Thu, Apr 24, 2014 at 9:06 AM, Ondřej Čertík <[email protected]> wrote: > On Wed, Apr 23, 2014 at 10:49 PM, Joachim Durchholz <[email protected]> > wrote: >> Am 23.04.2014 23:54, schrieb Ondřej Čertík: >> >>> On Wed, Apr 23, 2014 at 3:52 PM, Aaron Meurer <[email protected]> wrote: >>>> >>>> I think the polys are fast because it can assume that the expressions >>>> are polynomials, which is a much simpler data structure than a general >>>> mathematical expression. >>> >>> >>> That's exactly right. If you want a general data structure but without >>> assumptions and as fast as possible, >>> in Python, look here: >>> >>> https://github.com/certik/sympyx >>> >>> There is also cythonized version. It's fast (much faster than SymPy), >>> but still slower than CSymPy. >> >> >> What are the speed differences? > > CSymPy: > > In [1]: from csympy import Symbol, Integer > > In [2]: x = Symbol("x") > > In [3]: N, a, c = 3000, x, Integer(1) > > In [4]: %time for i in range(N): a += c*x**i; c *= -1 > CPU times: user 1.04 s, sys: 0 ns, total: 1.04 s > Wall time: 1.04 s
I've done this benchmark in C++ (https://github.com/sympy/csympy/pull/163): for (int i = 0; i < N; i++) { a = add(a, mul(c, pow(x, integer(i)))); c = mul(c, integer(-1)); } $ ./add1 625ms number of terms: 2998 Aaron, so this is the current overhead of the Python wrappers --- almost 1.6x slower. Now the time is comparable to sympycore, but since sympycore is a mix of Python and C, it shows that things can be done faster. I need to look into this further. > > sympyx Cython: > > In [1]: from sympy_pyx import Symbol, Integer > > In [2]: x = Symbol("x") > > In [3]: N, a, c = 3000, x, Integer(1) > > In [4]: %time for i in range(N): a += c*x**i; c *= -1 > CPU times: user 10.8 s, sys: 0 ns, total: 10.8 s > Wall time: 10.8 s > > sympyx Python: > > In [1]: from sympy_py import Symbol, Integer > > In [2]: x = Symbol("x") > > In [3]: N, a, c = 3000, x, Integer(1) > > In [4]: %time for i in range(N): a += c*x**i; c *= -1 > CPU times: user 1min 42s, sys: 4 ms, total: 1min 42s > Wall time: 1min 42s > > SymPy: > > In [1]: from sympy import Symbol, Integer > > In [2]: x = Symbol("x") > > In [3]: N, a, c = 3000, x, Integer(1) > > In [4]: %time for i in range(N): a += c*x**i; c *= -1 > CPU times: user 25.5 s, sys: 20 ms, total: 25.6 s > Wall time: 25.6 s > > > So I guess sympyx is doing something stupid. I think it's the loop: > > class Add(Basic): > > ... > > @classmethod > def canonicalize(cls, args): > use_glib = 0 > if use_glib: > from csympy import HashTable > d = HashTable() > else: > d = {} > num = Integer(0) > for a in args: > if a.type == INTEGER: > num += a > elif a.type == ADD: > for b in a.args: > if b.type == INTEGER: > num += b > else: > coeff, key = b.as_coeff_rest() > if key in d: > d[key] += coeff > else: > d[key] = coeff > ... > > So it loops over the dictionary in Add, which is inefficient. In > CSymPy, we copy the dictionary and add to it directly: > > RCP<const Basic> add(const RCP<const Basic> &a, const RCP<const Basic> &b) > { > CSymPy::umap_basic_int d; > RCP<const Number> coef; > RCP<const Basic> t; > ... > } else if (CSymPy::is_a<Add>(*a)) { > coef = (rcp_static_cast<const Add>(a))->coef_; > d = (rcp_static_cast<const Add>(a))->dict_; > if (is_a_Number(*b)) { > iaddnum(outArg(coef), rcp_static_cast<const Number>(b)); > } else { > RCP<const Number> coef2; > Add::as_coef_term(b, outArg(coef2), outArg(t)); > Add::dict_add_term(d, coef2, t); > } > > > >> Are the three versions doing the same thing? > > Good question, I thought it is, but it's apparently not, as I just > found out, as sympyx is using a list for arguments of Add, instead of > a dictionary. I also just tried sympycore: > > In [1]: from sympycore import Symbol, Calculus > SparsepolyHead.evalf does not define documentation string > > In [2]: x = Symbol("x") > > In [3]: N, a, c = 3000, x, Calculus(1) > > In [4]: %time for i in range(N): a += c*x**i; c *= -1 > CPU times: user 52 ms, sys: 0 ns, total: 52 ms > Wall time: 50.2 ms > > > So this sets the bar for this benchmark --- but one should note that > sympycore's "a" is mutable. If you do this: > > In [4]: %time for i in range(N): a = a + c*x**i; c *= -1 > CPU times: user 548 ms, sys: 144 ms, total: 692 ms > Wall time: 693 ms > > > Then it's closer. Still it's faster than CSymPy, so there is room for > improvement. I am glad I tried this, sympycore must be using better > algorithms. Though on this benchmark: > > In [1]: from sympycore import Symbol, Calculus > SparsepolyHead.evalf does not define documentation string > > In [2]: x = Symbol("x") > > In [3]: y = Symbol("y") > > In [4]: z = Symbol("z") > > In [5]: w = Symbol("w") > > In [6]: e = ((x+y+z+w)**15+w)*(x+y+z+w)**15 > > In [7]: e > Out[7]: Calculus('(y + x + z + w)**15*(w + (y + x + z + w)**15)') > > In [8]: %time f = e.expand() > CPU times: user 5.02 s, sys: 144 ms, total: 5.16 s > Wall time: 5.16 s > > > vs > > In [1]: from csympy import var > > In [2]: var("x y z w") > Out[2]: (x, y, z, w) > > In [3]: e = ((x+y+z+w)**15+w)*(x+y+z+w)**15 > > In [4]: %time f = e.expand() > CPU times: user 2 s, sys: 4 ms, total: 2 s > Wall time: 2 s > > > So more time is needed to spend on this to see why is sympycore faster > on something, but slower on something else. > > Ondrej -- You received this message because you are subscribed to the Google Groups "sympy" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. To post to this group, send email to [email protected]. Visit this group at http://groups.google.com/group/sympy. To view this discussion on the web visit https://groups.google.com/d/msgid/sympy/CADDwiVALYCQ%2B6chg9xBu0M8yTZGBLbkOtHeEXMRp861QdLN_GA%40mail.gmail.com. For more options, visit https://groups.google.com/d/optout.
