On Wed, Apr 23, 2014 at 10:49 PM, Joachim Durchholz <[email protected]> wrote:
> Am 23.04.2014 23:54, schrieb Ondřej Čertík:
>
>> On Wed, Apr 23, 2014 at 3:52 PM, Aaron Meurer <[email protected]> wrote:
>>>
>>> I think the polys are fast because it can assume that the expressions
>>> are polynomials, which is a much simpler data structure than a general
>>> mathematical expression.
>>
>>
>> That's exactly right. If you want a general data structure but without
>> assumptions and as fast as possible,
>> in Python, look here:
>>
>> https://github.com/certik/sympyx
>>
>> There is also cythonized version. It's fast (much faster than SymPy),
>> but still slower than CSymPy.
>
>
> What are the speed differences?

CSymPy:

In [1]: from csympy import Symbol, Integer

In [2]: x = Symbol("x")

In [3]: N, a, c = 3000, x, Integer(1)

In [4]: %time for i in range(N): a += c*x**i; c *= -1
CPU times: user 1.04 s, sys: 0 ns, total: 1.04 s
Wall time: 1.04 s

sympyx Cython:

In [1]: from sympy_pyx import Symbol, Integer

In [2]: x = Symbol("x")

In [3]: N, a, c = 3000, x, Integer(1)

In [4]: %time for i in range(N): a += c*x**i; c *= -1
CPU times: user 10.8 s, sys: 0 ns, total: 10.8 s
Wall time: 10.8 s

sympyx Python:

In [1]: from sympy_py import Symbol, Integer

In [2]: x = Symbol("x")

In [3]: N, a, c = 3000, x, Integer(1)

In [4]: %time for i in range(N): a += c*x**i; c *= -1
CPU times: user 1min 42s, sys: 4 ms, total: 1min 42s
Wall time: 1min 42s

SymPy:

In [1]: from sympy import Symbol, Integer

In [2]: x = Symbol("x")

In [3]: N, a, c = 3000, x, Integer(1)

In [4]: %time for i in range(N): a += c*x**i; c *= -1
CPU times: user 25.5 s, sys: 20 ms, total: 25.6 s
Wall time: 25.6 s


So I guess sympyx is doing something stupid. I think it's the loop:

class Add(Basic):

...

    @classmethod
    def canonicalize(cls, args):
        use_glib = 0
        if use_glib:
            from csympy import HashTable
            d = HashTable()
        else:
            d = {}
        num = Integer(0)
        for a in args:
            if a.type == INTEGER:
                num += a
            elif a.type == ADD:
                for b in a.args:
                    if b.type == INTEGER:
                        num += b
                    else:
                        coeff, key = b.as_coeff_rest()
                        if key in d:
                            d[key] += coeff
                        else:
                            d[key] = coeff
...

So it loops over the dictionary in Add, which is inefficient. In
CSymPy, we copy the dictionary and add to it directly:

RCP<const Basic> add(const RCP<const Basic> &a, const RCP<const Basic> &b)
{
    CSymPy::umap_basic_int d;
    RCP<const Number> coef;
    RCP<const Basic> t;
...
    } else if (CSymPy::is_a<Add>(*a)) {
        coef = (rcp_static_cast<const Add>(a))->coef_;
        d = (rcp_static_cast<const Add>(a))->dict_;
        if (is_a_Number(*b)) {
            iaddnum(outArg(coef), rcp_static_cast<const Number>(b));
        } else {
            RCP<const Number> coef2;
            Add::as_coef_term(b, outArg(coef2), outArg(t));
            Add::dict_add_term(d, coef2, t);
        }



> Are the three versions doing the same thing?

Good question, I thought it is, but it's apparently not, as I just
found out, as sympyx is using a list for arguments of Add, instead of
a dictionary. I also just tried sympycore:

In [1]: from sympycore import Symbol, Calculus
SparsepolyHead.evalf does not define documentation string

In [2]: x = Symbol("x")

In [3]: N, a, c = 3000, x, Calculus(1)

In [4]: %time for i in range(N): a += c*x**i; c *= -1
CPU times: user 52 ms, sys: 0 ns, total: 52 ms
Wall time: 50.2 ms


So this sets the bar for this benchmark --- but one should note that
sympycore's "a" is mutable. If you do this:

In [4]: %time for i in range(N): a = a + c*x**i; c *= -1
CPU times: user 548 ms, sys: 144 ms, total: 692 ms
Wall time: 693 ms


Then it's closer. Still it's faster than CSymPy, so there is room for
improvement. I am glad I tried this, sympycore must be using better
algorithms. Though on this benchmark:

In [1]: from sympycore import Symbol, Calculus
SparsepolyHead.evalf does not define documentation string

In [2]: x = Symbol("x")

In [3]: y = Symbol("y")

In [4]: z = Symbol("z")

In [5]: w = Symbol("w")

In [6]: e = ((x+y+z+w)**15+w)*(x+y+z+w)**15

In [7]: e
Out[7]: Calculus('(y + x + z + w)**15*(w + (y + x + z + w)**15)')

In [8]: %time f = e.expand()
CPU times: user 5.02 s, sys: 144 ms, total: 5.16 s
Wall time: 5.16 s


vs

In [1]: from csympy import var

In [2]: var("x y z w")
Out[2]: (x, y, z, w)

In [3]: e = ((x+y+z+w)**15+w)*(x+y+z+w)**15

In [4]: %time f = e.expand()
CPU times: user 2 s, sys: 4 ms, total: 2 s
Wall time: 2 s


So more time is needed to spend on this to see why is sympycore faster
on something, but slower on something else.

Ondrej

-- 
You received this message because you are subscribed to the Google Groups 
"sympy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/sympy.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/sympy/CADDwiVCrkjEE_S5tWScU7_cni%2B1XUdND7X%3DY0oRKA9x_Q8429A%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

Reply via email to