On Thu, Apr 24, 2014 at 9:06 AM, Ondřej Čertík <[email protected]> wrote:
> On Wed, Apr 23, 2014 at 10:49 PM, Joachim Durchholz <[email protected]> 
> wrote:
>> Am 23.04.2014 23:54, schrieb Ondřej Čertík:
>>
>>> On Wed, Apr 23, 2014 at 3:52 PM, Aaron Meurer <[email protected]> wrote:
>>>>
>>>> I think the polys are fast because it can assume that the expressions
>>>> are polynomials, which is a much simpler data structure than a general
>>>> mathematical expression.
>>>
>>>
>>> That's exactly right. If you want a general data structure but without
>>> assumptions and as fast as possible,
>>> in Python, look here:
>>>
>>> https://github.com/certik/sympyx
>>>
>>> There is also cythonized version. It's fast (much faster than SymPy),
>>> but still slower than CSymPy.
>>
>>
>> What are the speed differences?
>
> CSymPy:
>
> In [1]: from csympy import Symbol, Integer
>
> In [2]: x = Symbol("x")
>
> In [3]: N, a, c = 3000, x, Integer(1)
>
> In [4]: %time for i in range(N): a += c*x**i; c *= -1
> CPU times: user 1.04 s, sys: 0 ns, total: 1.04 s
> Wall time: 1.04 s

I've done this benchmark in C++ (https://github.com/sympy/csympy/pull/163):

    for (int i = 0; i < N; i++) {
        a = add(a, mul(c, pow(x, integer(i))));
        c = mul(c, integer(-1));
    }


$ ./add1
625ms
number of terms: 2998


Aaron, so this is the current overhead of the Python wrappers ---
almost 1.6x slower.
Now the time is comparable to sympycore, but since sympycore is a mix
of Python and C,
it shows that things can be done faster. I need to look into this further.

>
> sympyx Cython:
>
> In [1]: from sympy_pyx import Symbol, Integer
>
> In [2]: x = Symbol("x")
>
> In [3]: N, a, c = 3000, x, Integer(1)
>
> In [4]: %time for i in range(N): a += c*x**i; c *= -1
> CPU times: user 10.8 s, sys: 0 ns, total: 10.8 s
> Wall time: 10.8 s
>
> sympyx Python:
>
> In [1]: from sympy_py import Symbol, Integer
>
> In [2]: x = Symbol("x")
>
> In [3]: N, a, c = 3000, x, Integer(1)
>
> In [4]: %time for i in range(N): a += c*x**i; c *= -1
> CPU times: user 1min 42s, sys: 4 ms, total: 1min 42s
> Wall time: 1min 42s
>
> SymPy:
>
> In [1]: from sympy import Symbol, Integer
>
> In [2]: x = Symbol("x")
>
> In [3]: N, a, c = 3000, x, Integer(1)
>
> In [4]: %time for i in range(N): a += c*x**i; c *= -1
> CPU times: user 25.5 s, sys: 20 ms, total: 25.6 s
> Wall time: 25.6 s
>
>
> So I guess sympyx is doing something stupid. I think it's the loop:
>
> class Add(Basic):
>
> ...
>
>     @classmethod
>     def canonicalize(cls, args):
>         use_glib = 0
>         if use_glib:
>             from csympy import HashTable
>             d = HashTable()
>         else:
>             d = {}
>         num = Integer(0)
>         for a in args:
>             if a.type == INTEGER:
>                 num += a
>             elif a.type == ADD:
>                 for b in a.args:
>                     if b.type == INTEGER:
>                         num += b
>                     else:
>                         coeff, key = b.as_coeff_rest()
>                         if key in d:
>                             d[key] += coeff
>                         else:
>                             d[key] = coeff
> ...
>
> So it loops over the dictionary in Add, which is inefficient. In
> CSymPy, we copy the dictionary and add to it directly:
>
> RCP<const Basic> add(const RCP<const Basic> &a, const RCP<const Basic> &b)
> {
>     CSymPy::umap_basic_int d;
>     RCP<const Number> coef;
>     RCP<const Basic> t;
> ...
>     } else if (CSymPy::is_a<Add>(*a)) {
>         coef = (rcp_static_cast<const Add>(a))->coef_;
>         d = (rcp_static_cast<const Add>(a))->dict_;
>         if (is_a_Number(*b)) {
>             iaddnum(outArg(coef), rcp_static_cast<const Number>(b));
>         } else {
>             RCP<const Number> coef2;
>             Add::as_coef_term(b, outArg(coef2), outArg(t));
>             Add::dict_add_term(d, coef2, t);
>         }
>
>
>
>> Are the three versions doing the same thing?
>
> Good question, I thought it is, but it's apparently not, as I just
> found out, as sympyx is using a list for arguments of Add, instead of
> a dictionary. I also just tried sympycore:
>
> In [1]: from sympycore import Symbol, Calculus
> SparsepolyHead.evalf does not define documentation string
>
> In [2]: x = Symbol("x")
>
> In [3]: N, a, c = 3000, x, Calculus(1)
>
> In [4]: %time for i in range(N): a += c*x**i; c *= -1
> CPU times: user 52 ms, sys: 0 ns, total: 52 ms
> Wall time: 50.2 ms
>
>
> So this sets the bar for this benchmark --- but one should note that
> sympycore's "a" is mutable. If you do this:
>
> In [4]: %time for i in range(N): a = a + c*x**i; c *= -1
> CPU times: user 548 ms, sys: 144 ms, total: 692 ms
> Wall time: 693 ms
>
>
> Then it's closer. Still it's faster than CSymPy, so there is room for
> improvement. I am glad I tried this, sympycore must be using better
> algorithms. Though on this benchmark:
>
> In [1]: from sympycore import Symbol, Calculus
> SparsepolyHead.evalf does not define documentation string
>
> In [2]: x = Symbol("x")
>
> In [3]: y = Symbol("y")
>
> In [4]: z = Symbol("z")
>
> In [5]: w = Symbol("w")
>
> In [6]: e = ((x+y+z+w)**15+w)*(x+y+z+w)**15
>
> In [7]: e
> Out[7]: Calculus('(y + x + z + w)**15*(w + (y + x + z + w)**15)')
>
> In [8]: %time f = e.expand()
> CPU times: user 5.02 s, sys: 144 ms, total: 5.16 s
> Wall time: 5.16 s
>
>
> vs
>
> In [1]: from csympy import var
>
> In [2]: var("x y z w")
> Out[2]: (x, y, z, w)
>
> In [3]: e = ((x+y+z+w)**15+w)*(x+y+z+w)**15
>
> In [4]: %time f = e.expand()
> CPU times: user 2 s, sys: 4 ms, total: 2 s
> Wall time: 2 s
>
>
> So more time is needed to spend on this to see why is sympycore faster
> on something, but slower on something else.
>
> Ondrej

-- 
You received this message because you are subscribed to the Google Groups 
"sympy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/sympy.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/sympy/CADDwiVALYCQ%2B6chg9xBu0M8yTZGBLbkOtHeEXMRp861QdLN_GA%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

Reply via email to