Hi Mario,

On Tue, Nov 26, 2013 at 5:38 AM, mario <[email protected]> wrote:
>
> In PR 609 and PR 2630 power series are implemented with dictionaries, that
> is, in a sparse representation. The sparse representation is efficient when
> there are many variables. PR 2630 uses ``ring()``.
>
> For power series in one variable the dense representation is usually faster.
> In `compose_add` in PR 2630 series in one variable are used, so the dense
> representation would be faster, but only negligibly so, since `compose_add`
> takes roughly 5% of the time in `minpoly`.
>
> In PR 609 there is lpoly.py, which deals with power series, and ltaylor.py.
> lpoly.py can now be implemented using ``ring()``; this is partially done in
> ring_series.py in PR 2630, which contains only the functions used in the
> computed sum. It seems to me that ring_series.py fits well in SymPy: if one
> wants to compute power series efficiently, one can use ``ring`` with the
> ring_series.py functions (once all the series functions in lpoly.py are
> ported).
>
> To do a thorough treatment of power series, one should also implement
> the dense representation; consider the case of Laurent series;
> add functions to manipulate series at a higher level, possibly using a
> Series class with an Order function.
>
> Mateusz Paprocki might want to comment on this.
>
> PowerSeriesRing in Sage could be an example of a higher-level representation.
>
> ltaylor.py uses lpoly.py to compute the Taylor series of SymPy functions;
> if the function is not smooth enough, the computation is passed to
> series.series. So if the function is smooth enough, the Taylor series is
> computed fast (more or less at the speed of `taylor` in Sage); otherwise it
> can be slightly slower than series.series.
> I do not know if it is a good approach; anyway, since it uses power series,
> for the moment I will leave it aside.

I finally got to this. I checked out your branch:

https://github.com/sympy/sympy/pull/609

Well, actually Mateusz's improved branch, which I updated further
to run with the latest gmpy2:
https://github.com/certik/sympy/tree/lpoly2-improved

And I tested an example on my computer with gmpy:

In [1]: %time s = series(sin(x)*cos(x), x, 0, 1000)
CPU times: user 4.52 s, sys: 14 ms, total: 4.54 s
Wall time: 4.5 s

In [2]: %time s = series(log(1+x)*exp(x), x, 0, 500)
CPU times: user 4.15 s, sys: 14 ms, total: 4.16 s
Wall time: 4.13 s


Then I ran the same benchmark on the latest sympy master, using
your code that got merged at https://github.com/sympy/sympy/pull/2630.
I put the scripts at
https://github.com/certik/sympy/compare/formal_power_series; it's just
a.py and b.py. Your merged code doesn't seem to support the sin/cos
benchmark [1] above, so I only ran benchmark [2]:

https://github.com/certik/sympy/blob/59492520b443ea5f0ef31fc018e9bc700b93b818/b.py

$ python b.py
import
done
initialize series
done
start
stop
1.85430693626

Seems to be a little bit faster. Finally, I wrote my own implementation:

https://github.com/certik/sympy/blob/59492520b443ea5f0ef31fc018e9bc700b93b818/a.py

$ python a.py
import
done
initialize series
done
start
1.17188405991
stop
1.17299985886


It uses the same algorithm; the speed improvement comes just from
sorting the dictionary items before multiplying things out in rs_mul:

    items1 = list(p1._data.items())
    items1.sort(key=lambda e: e[0])
    items2 = list(p2._data.items())
    items2.sort(key=lambda e: e[0])


In sympy you only sort items2, while I noticed a speedup when sorting
items1 as well. But this is minor.
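For concreteness, here is a minimal sketch of such a truncated sparse
multiplication with both operands sorted; the name mul_truncated and the
plain-dict representation are my own illustration, not the actual
ring_series or a.py API:

```python
def mul_truncated(p1, p2, n):
    """Multiply two sparse series given as {exponent: coefficient} dicts,
    truncated at order n (terms of degree n and higher are dropped)."""
    items1 = sorted(p1.items())
    items2 = sorted(p2.items())
    result = {}
    for e1, c1 in items1:
        if e1 >= n:
            break  # items1 is sorted, so all remaining exponents are >= n too
        for e2, c2 in items2:
            e = e1 + e2
            if e >= n:
                break  # same early exit for the inner loop
            result[e] = result.get(e, 0) + c1 * c2
    return result
```

The point of sorting both operands is that both loops can break out as
soon as the combined exponent reaches the truncation order, instead of
computing terms that would be discarded anyway.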


All this uses the sparse representation (a dictionary). I also
implemented a dense representation; you can try it on example [2] by
applying the following patch:

diff --git a/a.py b/a.py
index 5fabdd9..f7caf4a 100644
--- a/a.py
+++ b/a.py
@@ -179,20 +179,20 @@ def div(a, b):
 for i in range(1, n):
     t = t/i
     data.append(t)
-exp = FormalPowerSeriesSparse.from_dense(data)
+exp = FormalPowerSeries(data)
 # log(1+x)
 data = [QQ.dtype(0)]
 t = QQ.dtype(1)
 for i in range(1, n):
     data.append(t/i)
     t = -t
-log = FormalPowerSeriesSparse.from_dense(data)
+log = FormalPowerSeries(data)
 #x = FormalPowerSeries([0, 1, 0, 0, 0, 0, 0, 0, 0])
 #onemx = FormalPowerSeries([1, -1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
 print "done"
 print "start"
 t1 = clock()
-s = mul_sparse(log, exp, n)
+s = mul(log, exp)
 #s = mul(sin, cos)
 t2 = clock()
 print "stop"


Then you get:


$ python a.py
import
done
initialize series
done
start
1.11554312706
stop
1.11559510231

So it's a little bit faster, but the difference is small.
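For comparison, the dense version is just the classic list-based
convolution; a minimal sketch (again with a hypothetical name, and
coefficients stored in a list indexed by exponent):

```python
def mul_dense(a, b, n):
    """Multiply two dense series (coefficient lists), truncated at order n."""
    c = [0] * n
    for i, ai in enumerate(a[:n]):
        if ai:  # skip zero coefficients
            for j, bj in enumerate(b[:n - i]):
                c[i + j] += ai * bj
    return c
```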

Conclusion --- a.py now contains a simple implementation of Taylor
series that is as fast as Mario's fast code.


Anyway, I think this is the right approach: we should support more
functions (sin, cos, ...) in sympy.polys.ring_series and then create a
class like FormalPowerSeriesSparse() in a.py that would hold the
series, know how to print it, and also have methods to manipulate it
(or those can be external functions, as in a.py). Finally, this should
be hooked into our series expansion in sympy.

Now, this works great for a Taylor expansion. We then need to extend
this to support Laurent series and eventually Puiseux series (that's
what SymPy's series is doing). An example of a Puiseux series is:

In [1]: %time print (1/cos(x/log(x))).series(x, 0, 10)
1 + x**2/(2*log(x)**2) + 5*x**4/(24*log(x)**4) +
61*x**6/(720*log(x)**6) + 277*x**8/(8064*log(x)**8) + O(x**10)
CPU times: user 3 s, sys: 14 ms, total: 3.01 s
Wall time: 3.03 s

As you can see, it currently takes 3 s, while in my opinion it should
be immediate. You can read the references that I quoted in the docstring
of FormalPowerSeries() in a.py. Essentially, I think these are all
series of the form a*X^n, where n = 0, 1, 2, ... for Taylor series,
n = n0, n0+1, n0+2, ... with negative n0 for Laurent series, and X is a
symbol. For Puiseux series, X is then something more complicated:
either a fractional power like x^(1/2), or something like x/log(x) in
the last example. But the contents of X are stored separately, and the
rest is just like polynomials. I am currently not sure whether things
like multiplication of two Puiseux series need to be changed somehow.
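To make the idea concrete, here is a rough sketch of storing the inner
expression X separately from a plain coefficient list; the class name
and layout are just an illustration of the idea, not a proposed API:

```python
from fractions import Fraction

class SeriesInX:
    """Series sum(coeffs[i] * X**(n0 + i)). X is kept as an opaque label,
    so Taylor (n0 = 0), Laurent (n0 < 0) and Puiseux-like expansions
    (X a fractional power, or something like x/log(x)) all share one
    polynomial-style representation of the coefficients."""
    def __init__(self, X, coeffs, n0=0):
        self.X = X
        self.coeffs = coeffs
        self.n0 = n0

    def __str__(self):
        terms = ["%s*(%s)**%s" % (c, self.X, self.n0 + i)
                 for i, c in enumerate(self.coeffs) if c]
        return " + ".join(terms) or "0"

# 1/cos(x/log(x)) ~ 1 + X**2/2 + 5*X**4/24 + ... with X = x/log(x)
s = SeriesInX("x/log(x)", [1, 0, Fraction(1, 2), 0, Fraction(5, 24)])
```

All the arithmetic then happens on the coefficient list, exactly as for
polynomials; only printing and substitution need to know what X is.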

Sparse/Dense: I would just go with sparse; the speed seems very good
both for really sparse series and for dense ones.

Overall, I think this is the way to go. Let me know what you think.

Ondrej
