On Thu, Apr 24, 2014 at 12:10 AM, mario <[email protected]> wrote:
> Ondřej, you wrote:
> For the sympy.polys in this example,
> is it using the sparse polynomial representation? I.e. it stores the
> symbols (x, y, z) and then stores their powers and coefficients
> in a dictionary, e.g.:
>
> 3*x^2*z + 2*x -> {3: (2, 0, 1), 2: (1, 0, 0)}
>
> ?
>
> In CSymPy, I am still implementing this data structure, as it is very
> efficient, but it only works for polynomials. So at the moment, we
> can't benchmark this yet.
>
> In the dictionary in that example you meant {(2, 0, 1): 3, (1, 0, 0): 2}.
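The corrected layout (exponent tuple as key, coefficient as value) is easy to sketch in plain Python. The helper below is purely illustrative and is not CSymPy's actual code; the nested loop only mirrors the idea behind the sparse multiplication in rings.cpp: add exponent tuples elementwise and multiply coefficients.

```python
# Minimal sketch of the sparse representation for symbols (x, y, z):
# each term maps an exponent tuple to its coefficient.
# 3*x^2*z + 2*x  ->  {(2, 0, 1): 3, (1, 0, 0): 2}
p = {(2, 0, 1): 3, (1, 0, 0): 2}

def poly_mul(p, q):
    """Multiply two sparse polynomials: for every pair of terms, add the
    exponent tuples elementwise and multiply the coefficients."""
    result = {}
    for ea, ca in p.items():
        for eb, cb in q.items():
            e = tuple(a + b for a, b in zip(ea, eb))
            c = result.get(e, 0) + ca * cb
            if c:
                result[e] = c
            else:
                result.pop(e, None)  # drop terms that cancel to zero
    return result

# (3*x^2*z + 2*x)**2 = 9*x^4*z^2 + 12*x^3*z + 4*x^2
print(poly_mul(p, p))
```

With many variables most entries of these exponent tuples are zero, which is exactly the case the ETuple-style structure discussed below (storing only the nonzero exponents) is meant to speed up.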
Ah, yes, of course.

> It is not fast with many variables; typically many of the exponents are
> zero, so it is not efficient to iterate on an array of exponents, which
> are mostly zero. It is faster to use a data structure keeping only the
> nonzero exponents, like the ETuple used in PolyDict in Sage. The code
> for ETuple is short and simple. Here is the documentation:
> http://www.sagemath.org/doc/reference/polynomial_rings/sage/rings/polynomial/polydict.html

Thanks for the tips. In CSymPy, here is the implementation of sparse
polynomial multiplication:

https://github.com/sympy/csympy/blob/master/src/rings.cpp#L63

and 'umap_vec_mpz' is defined here:

https://github.com/sympy/csympy/blob/master/src/dict.h#L67

If you have tips on how to implement ETuple in C++ and improve upon this
data structure, please let me know.

Ondrej

> On Wednesday, April 23, 2014 5:35:59 PM UTC+2, Ondřej Čertík wrote:
>>
>> On Wed, Apr 23, 2014 at 9:21 AM, Mateusz Paprocki <[email protected]> wrote:
>> > Hi,
>> >
>> > On 23 April 2014 07:36, Ondřej Čertík <[email protected]> wrote:
>> >> On Tue, Apr 22, 2014 at 9:52 PM, Aaron Meurer <[email protected]> wrote:
>> >>> On Tue, Apr 22, 2014 at 10:21 PM, Ondřej Čertík <[email protected]> wrote:
>> >>>> On Tue, Apr 22, 2014 at 6:06 PM, Aaron Meurer <[email protected]> wrote:
>> >>>>> On Tue, Apr 22, 2014 at 12:05 PM, Ondřej Čertík <[email protected]> wrote:
>> >>>>>> Hi Aaron,
>> >>>>>>
>> >>>>>> Those are good questions. Here are the answers:
>> >>>>>>
>> >>>>>> On Tue, Apr 22, 2014 at 10:13 AM, Aaron Meurer <[email protected]> wrote:
>> >>>>>>> I have some high level questions about CSymPy.
>> >>>>>>>
>> >>>>>>> - What are the goals of the project?
>> >>>>>>
>> >>>>>> The goals of the project are:
>> >>>>>>
>> >>>>>> * Fastest symbolic manipulation library, compared to other codes,
>> >>>>>> commercial or open source (Sage, GiNaC, Mathematica, ...).
>> >>>>>>
>> >>>>>> * Extension/complement to SymPy
>> >>>>>>
>> >>>>>> * If the above two goals allow, be able to also call it from other
>> >>>>>> languages easily and efficiently (Julia, Ruby, Mathematica, ...)
>> >>>>>>
>> >>>>>> As to the technical solution: the core should be a C++ library,
>> >>>>>> which can depend on other compiled libraries if needed. The core
>> >>>>>> should not depend on Python or Ruby or Julia, but rather be just
>> >>>>>> one language, C++. That lowers the barrier of entry significantly,
>> >>>>>> compared to a big mix of C++, Cython and Python, makes it easier
>> >>>>>> to make things fast (you don't need to worry about Python at all).
>> >>>>>> The Python (and other languages) wrappers should be just thin
>> >>>>>> wrappers around the C++ core (=just better syntax).
>> >>>>>>
>> >>>>>> There might be other technical solutions to this, but I know that
>> >>>>>> I can deliver the above goals with this solution (and I failed to
>> >>>>>> deliver with other solutions, like writing the core in Cython).
>> >>>>>> So that's why we do it this way.
>> >>>>>>
>> >>>>>> Also, by being "just a C++ library", other people can use it in
>> >>>>>> their projects. I hope to get the interest of a much broader
>> >>>>>> community that way, who can contribute back (somebody will need
>> >>>>>> fast symbolic manipulation in Julia, so they can just use CSymPy
>> >>>>>> with Julia wrappers, and contribute improvements back).
>> >>>>>>
>> >>>>>>>
>> >>>>>>> - What are the things that should definitely go in CSymPy?
>> >>>>>>
>> >>>>>> At the moment: all the things needed to make specific applications
>> >>>>>> fast, in particular PyDy. For that, it needs basic manipulation,
>> >>>>>> differentiation, series expansion (I think) and matrices. That's
>> >>>>>> all roughly either done, or on the way. Of course, lots of
>> >>>>>> polishing is needed.
>> >>>>>
>> >>>>> I think that's already too much.
>> >>>>> Why is the series expansion slow in SymPy? Is it because the
>> >>>>> algorithms are slow? If so, then implementing the same inefficient
>> >>>>> algorithms in CSymPy won't help. They will be faster, but for large
>> >>>>> enough expressions they will still slow down. Is it because the
>> >>>>> expression manipulation is slow? In that case, if CSymPy has faster
>> >>>>> expression manipulation, then just use those expressions, but use
>> >>>>> the SymPy series algorithms.
>> >>>>
>> >>>> My experience is that it will actually help to implement the same
>> >>>> algorithm, because there is a little overhead with any Python
>> >>>> operation.
>> >>>
>> >>> Exactly. There is a "little" overhead. Not a huge overhead. It
>> >>> matters for the stuff that is in the inner loops, like addition and
>> >>> multiplication of terms, but for whole algorithms, which might be
>> >>> called only a few times (as opposed to a few hundred thousand times),
>> >>> it doesn't make a difference.
>> >>>
>> >>> This is all hypothetical without numbers (and btw, it would be
>> >>> awesome if you could provide real numbers here), but suppose these
>> >>> imaginary numbers were true:
>> >>>
>> >>> SymPy: 1x
>> >>> CSymPy with Python wrappers: 4x
>> >>> Raw CSymPy: 5x
>> >>>
>> >>> Then using CSymPy with Python would already be 4x faster than SymPy.
>> >>> Now doing everything in raw CSymPy would only be 1.25x faster than
>> >>> that.
>> >>>
>> >>> Now, if CSymPy integrates flawlessly, so that it just works (at least
>> >>> as far as the user is concerned), there is little complexity cost to
>> >>> CSymPy + Python. Definitely little enough to warrant the 4x speedup.
>> >>> But as soon as you take that away, i.e., you implement more and more
>> >>> in C++, or CSymPy differs enough from SymPy that the user needs to
>> >>> care about it (which the more that is in CSymPy, the more likely this
>> >>> is to happen), then the complexity cost skyrockets.
>> >>> Maybe 4x would still be worth it here. But not 1.25x.
>> >>>
>> >>>> So if you do a lot of them (like in series expansion, where you
>> >>>> create intermediate expressions and so on --- and even if we use
>> >>>> CSymPy, there is overhead in the Python wrappers, so essentially you
>> >>>> don't want to be calling them too often, if you *really* care about
>> >>>> performance), they will add up.
>> >>>
>> >>> Sure, you could just implement a whole CAS in C++. That's what some
>> >>> people have done already. But you have to factor in the costs of
>> >>> everything, not just the speed. The costs of:
>> >>>
>> >>> - How much more complicated the code is
>> >>> - Code duplication (and all the associated issues that come with it)
>> >>> - The additional overhead needed for CSymPy to interop with SymPy.
>> >>> The more CSymPy does, the harder this is.
>> >>>
>> >>> Is series expansion slow? Is it an inner loop (i.e., will it matter
>> >>> to people if it is slow)? Is it simple to implement ('simple' being a
>> >>> relative term of course; obviously no part of a CAS is completely
>> >>> simple)? If the answer to any of those is "no", I think you should
>> >>> seriously consider whether it's worth implementing.
>> >>>
>> >>>>
>> >>>>>
>> >>>>> My points are:
>> >>>>>
>> >>>>> - I think CSymPy should focus on making expression manipulation
>> >>>>> fast (i.e., the things that are the inner loop of any symbolic
>> >>>>> algorithm). It should not reimplement the symbolic algorithms
>> >>>>> themselves. Those are implemented in SymPy. If they use the CSymPy
>> >>>>> objects instead of the SymPy objects, they will be faster.
>> >>>>>
>> >>>>> - I would focus more on making CSymPy interoperate with SymPy and
>> >>>>> less on reimplementing things that are in SymPy in CSymPy. Once
>> >>>>> there is interoperation, we can see what is still slow, and then
>> >>>>> (and only then) implement it in C++.
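Spelled out, the arithmetic behind the hypothetical speedups above looks like this (the figures are Aaron's illustrative numbers, not measurements):

```python
# Aaron's imaginary speedups, expressed relative to plain SymPy (1x).
sympy_speed = 1.0
csympy_wrapped = 4.0  # CSymPy used through Python wrappers
csympy_raw = 5.0      # everything done in C++

# Wrapped CSymPy vs SymPy: a 4x win.
print(csympy_wrapped / sympy_speed)   # 4.0
# Raw C++ vs wrapped CSymPy: only a further 1.25x.
print(csympy_raw / csympy_wrapped)    # 1.25
```

The point of the comparison: the first ratio comes almost for free once the wrappers exist, while chasing the second ratio means moving whole algorithms into C++.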
>> >>>>
>> >>>> The interoperability is important and that is mostly the job of the
>> >>>> Python wrappers. The C++ API underneath can change a lot, i.e. if we
>> >>>> figure out a faster way to represent things, we'll switch. I will
>> >>>> spend lots of time making the interoperability work. This summer
>> >>>> though, I want to think hard about raw speed and making it work for
>> >>>> PyDy.
>> >>>>
>> >>>>>
>> >>>>>>
>> >>>>>>>
>> >>>>>>> - What are the things that should definitely not go in CSymPy?
>> >>>>>>
>> >>>>>> Things that don't need to be fast. Things like limits. Also things
>> >>>>>> that are in SymPy, where CSymPy can be used as a drop-in
>> >>>>>> replacement for the engine: PyDy, some stuff in physics, and so
>> >>>>>> on. There is no need to rewrite PyDy in C++. Also most users' code
>> >>>>>> would stay in Python. They can just optionally change to CSymPy
>> >>>>>> for some intensive calculation, then finish the thing with SymPy.
>> >>>>>>
>> >>>>>>>
>> >>>>>>> - How will CSymPy be architected to allow things to happen in
>> >>>>>>> CSymPy when they can but fall back to SymPy when they cannot?
>> >>>>>>
>> >>>>>> Currently you can simply mix and match SymPy and CSymPy
>> >>>>>> expressions. So you simply convert an expression to SymPy to do
>> >>>>>> some advanced manipulation, and convert to CSymPy to do some fast
>> >>>>>> manipulation. I am open to suggestions on how to improve this.
>> >>>>>>
>> >>>>>>>
>> >>>>>>> My main concern here is that CSymPy has no clear separation from
>> >>>>>>> SymPy, and as a result it will end up growing larger and larger,
>> >>>>>>> until it becomes an independent CAS (which is fine if that's the
>> >>>>>>> goal, but my understanding was that it was supposed to be just a
>> >>>>>>> small fast core).
>> >>>>>>
>> >>>>>> The goals are written above.
>> >>>>>> I am myself concentrating on speed; that's what I really want to
>> >>>>>> nail down. And then enough features so that it's useful for all
>> >>>>>> the people who found SymPy slow. However, let's say somebody comes
>> >>>>>> and reimplements the Gruntz algorithm in CSymPy. Should we reject
>> >>>>>> such a PR? My answer is that if the code is nice, maintainable,
>> >>>>>> much faster than SymPy and has the same or similar features, I am
>> >>>>>> ok with merging it. If the code is a mess, then not. But as I
>> >>>>>> said, I am spending my own time on things which people need, and
>> >>>>>> faster limits don't seem to be it.
>> >>>>>>
>> >>>>>>>
>> >>>>>>> In particular, if there is some feature of SymPy functions, how
>> >>>>>>> will CSymPy be architected so that it can take advantage of it
>> >>>>>>> without having to completely reimplement that function in C++?
>> >>>>>>
>> >>>>>> You can convert any expression back and forth, so you keep it in
>> >>>>>> SymPy if you want to have some particular feature. See also the
>> >>>>>> conversation and specific examples here:
>> >>>>>>
>> >>>>>> https://github.com/sympy/csympy/issues/153
>> >>>>>>
>> >>>>>>>
>> >>>>>>> For instance, a current goal of CSymPy is to implement trig
>> >>>>>>> functions. But this can be quite complicated if you consider all
>> >>>>>>> the different things you can do with trig functions. Without even
>> >>>>>>> thinking about trig simplification, there are complicated
>> >>>>>>> evaluation issues (e.g., consider sin(pi/7).rewrite(sqrt) in
>> >>>>>>> SymPy). It would be a shame to reimplement all this logic twice,
>> >>>>>>> especially since it is not needed for performance.
>> >>>>>>
>> >>>>>> Agreed. On the other hand, we really need very fast trig
>> >>>>>> functions.
>> >>>>>> The functionality that we need is simplifications like
>> >>>>>> sin(2*pi) -> 0, differentiation and series expansion.
>> >>>>>
>> >>>>> Why do you need to implement those in C++? If the expression
>> >>>>> manipulation is fast, then won't it be fine to have the actual
>> >>>>> formulas/algorithms in SymPy?
>> >>>>
>> >>>> Maybe; that depends on this issue:
>> >>>>
>> >>>> https://github.com/sympy/csympy/issues/153
>> >>>>
>> >>>> The problem is that once you start thinking about Python+C++ at once
>> >>>> and performance, things get complex quickly. It's much easier to
>> >>>> think in terms of C++ only and how to write the fastest possible
>> >>>> algorithm (that is hard enough!). This sets the bar. Then one should
>> >>>> try to see if it is possible to match this with Python. Not the
>> >>>> other way round, because you need to set the bar first.
>> >>>>
>> >>>> On Tue, Apr 22, 2014 at 6:16 PM, Aaron Meurer <[email protected]> wrote:
>> >>>>> On Tue, Apr 22, 2014 at 4:58 PM, Joachim Durchholz <[email protected]> wrote:
>> >>>>>> Hm.
>> >>>>>>
>> >>>>>> One:
>> >>>>>>> * Extension/complement to SymPy
>> >>>>>>
>> >>>>>> Two:
>> >>>>>>
>> >>>>>>> That lowers the barrier
>> >>>>>>> of entry significantly, compared to a big mix of C++, Cython and
>> >>>>>>> Python, makes it easier to make things fast
>> >>>>>>> (you don't need to worry about Python at all).
>> >>>>>>
>> >>>>>> That's not an extension nor a complement, it's a replacement.
>> >>>>>>
>> >>>>>>> The Python (and other
>> >>>>>>> languages) wrappers should be just thin
>> >>>>>>> wrappers around the C++ core (=just better syntax).
>> >>>>>>
>> >>>>>> I.e. replace the engine if not the API.
>> >>>>>>
>> >>>>>> Not that I'm judging. I'm just pointing out perceived
>> >>>>>> inconsistencies.
>> >>>>>>
>> >>>>>> The more SymPy itself turns into a set of simplification rules,
>> >>>>>> the less significance this will have in the end.
>> >>>>>>
>> >>>>>>>> - How will CSymPy be architected to allow things to happen in
>> >>>>>>>> CSymPy when they can but fall back to SymPy when they cannot?
>> >>>>>>>
>> >>>>>>> Currently you can simply mix and match SymPy and CSymPy
>> >>>>>>> expressions. So you simply convert an expression to SymPy to do
>> >>>>>>> some advanced manipulation, and convert to CSymPy to do some fast
>> >>>>>>> manipulation. I am open to suggestions on how to improve this.
>> >>>>>>
>> >>>>>> The alternative would be to have a data structure that can be
>> >>>>>> manipulated from both the C++ and the Python side, but that's
>> >>>>>> going to be unnatural for at least one of the sides.
>> >>>>>>
>> >>>>>> Note that the data structures can become large-ish, and if the
>> >>>>>> simplification becomes complicated there may be a lot of
>> >>>>>> back-and-forth. It's possible that mixed execution will be slow
>> >>>>>> for some algorithms or use cases for that reason.
>> >>>>>>
>> >>>>>> I do not think that this can be determined in advance; it's
>> >>>>>> something to keep an eye out for during benchmarks.
>> >>>>>>
>> >>>>>>>> My main concern here is that CSymPy has no clear separation from
>> >>>>>>>> SymPy, and as a result it will end up growing larger and larger,
>> >>>>>>>> until it becomes an independent CAS (which is fine if that's the
>> >>>>>>>> goal, but my understanding was that it was supposed to be just a
>> >>>>>>>> small fast core).
>> >>>>>>>
>> >>>>>>> The goals are written above. I am myself concentrating on speed;
>> >>>>>>> that's what I really want to nail down.
>> >>>>>>
>> >>>>>> I'm somewhat sceptical about this. A conversion to C++ will give a
>> >>>>>> linear improvement. Better algorithms can improve the big-O class.
>> >>>>>> Unless algorithmic improvements have been exhausted, this would be
>> >>>>>> premature optimization. (Are algorithmic improvements exhausted
>> >>>>>> yet?)
>> >>>>>
>> >>>>> That is true. I usually prefer to use faster algorithms.
>> >>>>>
>> >>>>> But to be the devil's advocate, there are two issues with this line
>> >>>>> of thinking:
>> >>>>>
>> >>>>> - Big O is basically useless. Consider the extreme effectiveness of
>> >>>>> SAT solvers (which solve an NP-complete problem), or the difference
>> >>>>> between the simplex and Khachiyan's algorithm, or AKS vs. more
>> >>>>> efficient deterministic primality testing algorithms. Asymptotic
>> >>>>> complexity is all fine, but at the end of the day, you don't care
>> >>>>> how fast your algorithm is for increasingly large inputs, you care
>> >>>>> how fast it is for *your* input.
>> >>>>>
>> >>>>> - Faster algorithms have a complexity cost. You can get closer to
>> >>>>> the metal in Python by being very careful about your use of data
>> >>>>> structures, and avoiding things that are slow in Python (like
>> >>>>> function calls), but the cost is high because you end up with code
>> >>>>> that is not only harder to read and maintain, but harder to keep in
>> >>>>> its fast state, because someone else who doesn't know all the
>> >>>>> little tricks might come along and change things in a way that
>> >>>>> seems equivalent but makes things slower.
>> >>>>
>> >>>> Precisely. That's why it's good to stick to just one language, C++,
>> >>>> and nail the speed. That sets the bar. Then one can try to match the
>> >>>> speed with Python, which sometimes is possible.
>> >>>>
>> >>>>>
>> >>>>> With that being said, C++ is itself enough of a complexity cost
>> >>>>> that doing this outweighs using it in many (most?) cases. (That's
>> >>>>> not just a knock on C++; using any second language to Python brings
>> >>>>> a cost, both because there are now two languages to think about,
>> >>>>> and because of interoperability questions.)
>> >>>>
>> >>>> Yes, again the reason why to stick to just one language and only do
>> >>>> thin wrappers that allow using it as a black box.
>> >>>>
>> >>>> On Tue, Apr 22, 2014 at 6:23 PM, Brian Granger <[email protected]> wrote:
>> >>>>> I too feel that csympy should implement the absolute minimum
>> >>>>> possible for it to be fast. All actual mathematical algorithms
>> >>>>> should remain in sympy. Trying to pull lots of algorithms into
>> >>>>> csympy will fail - not that many people want to write complex
>> >>>>> mathematical algorithms in C++.
>> >>>>
>> >>>> I understand your worries. Both yours and Aaron's, and those of
>> >>>> other people in the SymPy community. Let's be frank about this;
>> >>>> here they are:
>> >>>
>> >>> Thanks for your frankness. So I think you do understand the issues.
>> >>>
>> >>>>
>> >>>> * keep things simple, maintainable, in Python, not introducing other
>> >>>> languages
>> >>>>
>> >>>> * especially not C++, which is notorious for being hilariously
>> >>>> complex. That's why we use Python, because it's better.
>> >>>
>> >>> Well, there are also languages that are fast and not C++, but we can
>> >>> have that discussion separately.
>> >>>
>> >>>>
>> >>>> * danger of splitting the community by introducing a separate CAS
>> >>>> (= wasting our resources, attention, developers, and so on), and
>> >>>> we've been on this path before, e.g. with sympycore, or even Sage
>> >>>> --- any improvement to those codes does not benefit SymPy.
>> >>>> That does not mean that there is anything bad with those, but one
>> >>>> has to try to focus, in our case on SymPy, and just get things done,
>> >>>> make it a useful library so that people can use it and benefit from
>> >>>> it.
>> >>>>
>> >>>> * that we should concentrate on features, and sacrifice some
>> >>>> reasonable speed (maybe 100x or so).
>> >>>>
>> >>>> * We should not be reimplementing SymPy in C++; that would be a big
>> >>>> waste.
>> >>>
>> >>> One of my worries is not listed here, which is that you are doing
>> >>> things completely backwards from good software design with CSymPy,
>> >>> which is getting speed first, and something that works later.
>> >>
>> >> The goal is to get something that works first, polished API later.
>> >> By "work" I mean fast and working for PyDy (at first).
>> >>
>> >>>
>> >>>>
>> >>>> I thought about all these very deeply. And I can promise that I
>> >>>> will do my absolute best to make this work, with the SymPy
>> >>>> community. I might not have the best answer to all the worries, but
>> >>>> I know that we need to set the bar:
>> >>>>
>> >>>> * So implement trig functions in C++
>> >>>> * Benchmark against Mathematica, Sage, GiNaC
>> >>>> * Make sure it is as fast or faster
>> >>>> * See if we can match it with Python
>> >>>> * If so, great! If not, what is the penalty? 2x? 10x? 100x?
>> >>>
>> >>> That is exactly what I want to know.
>> >>>
>> >>>>
>> >>>> If we don't have this bar, then we miss on speed. And if we miss on
>> >>>> speed then there will always be a reason why people would use other
>> >>>> software, because of speed. If, on the other hand, csympy is as fast
>> >>>> as the state of the art, then it fixes the problem. And it will be
>> >>>> integrated with SymPy well, so people can keep using SymPy.
>> >>>>
>> >>>> Ondrej
>> >>>
>> >>> I feel like without real numbers, we aren't going to get anywhere, so
>> >>> maybe you could provide some benchmarks.
>> >>> I'm not convinced about expand, because that relies pretty heavily
>> >>> on other things like multinomial coefficient generation. I'd rather
>> >>> see a benchmark that does the exact same expression manipulations
>> >>> everywhere.
>> >>>
>> >>> Feel free to suggest a better one. I'm just coming up with this from
>> >>> the seat of my pants, but something like
>> >>>
>> >>> a = x
>> >>> c = 1
>> >>> for i in range(1000):  # Replace with larger numbers if necessary
>> >>>     a += c*i*x  # If CSymPy expressions are mutable, modify this accordingly
>> >>>     c *= -1
>> >>
>> >> Sure. Here is the code:
>> >>
>> >> from csympy import var, Integer
>> >> #from sympy import var, Integer
>> >> var("x")
>> >> a = x
>> >> c = Integer(1)
>> >> N = 10**5
>> >> for i in range(N):
>> >>     a += c*i*x
>> >>     c *= -1
>> >> print a
>> >>
>> >> SymPy:
>> >>
>> >> $ time python a.py
>> >> -49999*x
>> >>
>> >> real    0m35.262s
>> >> user    0m34.870s
>> >> sys     0m0.300s
>> >>
>> >> CSymPy:
>> >>
>> >> $ time python a.py
>> >> -49999x
>> >>
>> >> real    0m0.860s
>> >> user    0m0.852s
>> >> sys     0m0.004s
>> >>
>> > Comparing sympy.polys and sympy.core:
>> >
>> > In [1]: R, x = ring("x", ZZ)
>> >
>> > In [2]: y = Symbol("y")
>> >
>> > In [3]: N, a, c = 10**5, x, ZZ(1)
>> >
>> > In [4]: %time for i in range(N): a += c*i*x; c *= -1
>> > CPU times: user 564 ms, sys: 4.85 ms, total: 569 ms
>> > Wall time: 555 ms
>> >
>> > In [5]: N, a, c = 10**5, y, Integer(1)
>> >
>> > In [6]: %time for i in range(N): a += c*i*y; c *= -1
>> > CPU times: user 20 s, sys: 133 ms, total: 20.1 s
>> > Wall time: 20 s
>> >
>> >> So this particular one is 41x faster in CSymPy.
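As a sanity check on the printed result: the loop starts from a = x and adds (-1)**i * i * x at each step, so the final coefficient of x is 1 plus the alternating sum of i for i in 0..N-1, which is 1 - N/2 = -49999 for N = 10**5. A plain-integer version of the loop (an illustration independent of SymPy or CSymPy) confirms this:

```python
# Track only the integer coefficient of x instead of building a symbolic
# expression; otherwise the loop is identical to the benchmark above.
N = 10**5
coeff = 1  # "a = x" starts with coefficient 1
c = 1
for i in range(N):
    coeff += c * i
    c *= -1
print(coeff)  # -49999, matching the -49999*x printed above
```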
>> >> You can modify this to generate some long expressions, e.g.:
>> >>
>> >> from csympy import var, Integer
>> >> #from sympy import var, Integer
>> >> var("x")
>> >> a = x
>> >> c = Integer(1)
>> >> N = 3000
>> >> for i in range(N):
>> >>     a += c*x**i
>> >>     c *= -1
>> >>
>> >> SymPy:
>> >>
>> >> $ time python a.py
>> >>
>> >> real    0m37.890s
>> >> user    0m37.626s
>> >> sys     0m0.152s
>> >>
>> >> CSymPy:
>> >>
>> >> $ time python a.py
>> >>
>> >> real    0m1.032s
>> >> user    0m1.020s
>> >> sys     0m0.012s
>> >>
>> > Comparing sympy.polys and sympy.core:
>> >
>> > In [1]: R, x = ring("x", ZZ)
>> >
>> > In [2]: y = Symbol("y")
>> >
>> > In [3]: N, a, c = 3000, x, ZZ(1)
>> >
>> > In [4]: %time for i in range(N): a += c*x**i; c *= -1
>> > CPU times: user 148 ms, sys: 4.3 ms, total: 152 ms
>> > Wall time: 147 ms
>> >
>> > In [5]: N, a, c = 3000, y, Integer(1)
>> >
>> > In [6]: %time for i in range(N): a += c*y**i; c *= -1
>> > CPU times: user 20.6 s, sys: 42.6 ms, total: 20.6 s
>> > Wall time: 20.6 s
>> >
>> > So, what's the difference between CSymPy's +=, *=, *, **, etc.
>> > operators and SymPy's ones? Are they in-place? What are the underlying
>> > data structures? Do they use the same set of rewrite rules? Do they
>> > take assumptions into account? When comparing sympy.polys and
>> > sympy.core, it's obvious that sympy.polys will be faster because it
>> > simply does a lot less compared to sympy.core.
>>
>> They don't use assumptions. I think assumptions should not be taken
>> into account in these manipulations, but rather applied using
>> refine().
>>
>> For the sympy.polys in this example, is it using the sparse polynomial
>> representation? I.e. it stores the symbols (x, y, z) and then stores
>> their powers and coefficients in a dictionary, e.g.:
>>
>> 3*x^2*z + 2*x -> {3: (2, 0, 1), 2: (1, 0, 0)}
>>
>> ?
>>
>> In CSymPy, I am still implementing this data structure, as it is very
>> efficient, but it only works for polynomials.
>> So at the moment, we can't benchmark this yet.
>>
>> The internal data structure for the CSymPy benchmark above is currently
>> std::unordered_map inside CSymPy::Add. So a better benchmark is
>> something like this then:
>>
>> %time for i in range(N): a += c*x**(i*x); c *= -1
>>
>> which fails with:
>>
>> TypeError: int() argument must be a string or a number, not 'PolyElement'
>>
>> if you try to use polynomials.
>>
>> So in SymPy:
>>
>> In [1]: from sympy import Symbol, Integer
>>
>> In [2]: x = Symbol("x")
>>
>> In [3]: N, a, c = 3000, x, Integer(1)
>>
>> In [4]: %time for i in range(N): a += c*x**(i*x); c *= -1
>> CPU times: user 25.8 s, sys: 8 ms, total: 25.8 s
>> Wall time: 25.8 s
>>
>> And CSymPy:
>>
>> In [1]: from csympy import Symbol, Integer
>>
>> In [2]: x = Symbol("x")
>>
>> In [3]: N, a, c = 3000, x, Integer(1)
>>
>> In [4]: %time for i in range(N): a += c*x**(i*x); c *= -1
>> CPU times: user 1.17 s, sys: 0 ns, total: 1.17 s
>> Wall time: 1.17 s
>>
>> Ondrej
>
> --
> You received this message because you are subscribed to the Google Groups "sympy" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at http://groups.google.com/group/sympy.
> To view this discussion on the web visit https://groups.google.com/d/msgid/sympy/c64643bb-8e68-4fa5-8eee-8ed7f73c10a8%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to the Google Groups "sympy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/sympy.
To view this discussion on the web visit https://groups.google.com/d/msgid/sympy/CADDwiVBEcSo8-fWR9-RbSGoMCORxcZs2RTBnw_AMiCCa1B7QCA%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.
