Och, sorry for the delay; I didn't get an email that you'd responded.
Gotta love corporate filters ...

> Is this related to cylindrical algebraic decomposition?

No.  The framework could perhaps be applied to problems in that area,
but we are not specifically doing CAD of this nature.

> What do you mean by reduction? Is it mainly cancelation of things like 2*x -
> x and x**2/x, or is there more involved?

Hah!  That's a great question.  :-)  I'm fairly new to the code base,
but that appears to be a fair rough summary.  We do a couple of things
beyond that, but they're specific to us, and nothing that couldn't use
some cleanup via better object design and offloading more work to the
library.

> A slightly better way to do this is to send the list straight to Add.

I was wondering if I was overlooking something like this.  As
efficient as list processing is, reduce() *still* incurs the overhead
of an extra function call, plus the associated bookkeeping, for every
element.  Without looking deeper, I'll bet that's a good chunk of the
difference between the two paths; it certainly drops run-time costs
tremendously.  I also wonder whether Add sets up some lazy action: my
timing claims that the Add call returns in half a second (0.5 s), but
the printing stage takes much longer.  For reference, here's what I'm
now using:

<code language="python">
#!/usr/bin/python

from time import time
import sys

from sympy import *

def sum_it ( **kwargs ):
   size = int( kwargs.pop( 'size', 30 ) )

   Xvars = [ Symbol( 'X%i' % i ) for i in xrange(size) ]
   Yvars = [ Symbol( 'Y%i' % i ) for i in xrange(size) ]
   mysum = lambda x, y: x + y
   start = time()
   #answer = reduce( mysum, (x * y for x in Xvars for y in Yvars) )
   answer = Add( *[x * y for x in Xvars for y in Yvars] )
   duration = time() - start
   return (answer, duration)

size = sys.argv[1] if len( sys.argv ) > 1 else 30
print 'Running with size %d ...' % int( size )

start = time()
result, calc_duration = sum_it( size=size )
print result
print_duration = time() - start - calc_duration

print "Calculation: %s seconds." % str( calc_duration )
print "Printing:    %s seconds." % str( print_duration )
</code>

> #NOTE, the first time did take a long time as it did for you, but I think it \
> was much faster after that due to caching.

What kind of caching?  After running this once or twice, the Sympy
code should be well primed in the OS cache.  The results to which I
alluded were consistent over multiple runs.  Is there something more
to which you're referring?
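(For what it's worth, SymPy also keeps an in-process cache of
constructed expressions, separate from anything the OS does; if that's
the cache meant here, something like the following sketch should
expose it.  It assumes `clear_cache` lives in `sympy.core.cache`,
which is true in recent releases but may vary by version.)

```python
#!/usr/bin/python
# Sketch: probe SymPy's in-process expression cache, as distinct from
# the OS file cache.  Assumes sympy.core.cache.clear_cache exists.
from time import time

from sympy import Add, Symbol
from sympy.core.cache import clear_cache

def timed_sum ( size ):
    xs = [ Symbol( 'X%i' % i ) for i in range( size ) ]
    ys = [ Symbol( 'Y%i' % i ) for i in range( size ) ]
    start = time()
    Add( *[ x * y for x in xs for y in ys ] )
    return time() - start

cold = timed_sum( 30 )     # first run: cache starts empty
warm = timed_sum( 30 )     # second run: the products are already cached
clear_cache()
cleared = timed_sum( 30 )  # after clearing: should resemble the cold run

print( 'cold=%.4fs warm=%.4fs cleared=%.4fs' % (cold, warm, cleared) )
```

If the warm run is noticeably faster than the cold and cleared runs,
that would point at the library cache rather than the OS.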

> I think the list comprehension is probably the best way to do that bit,
> except maybe using a generator instead of a list comprehension (i.e., (x*y
> for x in Xvars for y in Yvars) instead of [x*y for x in Xvars for y in
> Yvars]).

Good point.  We do use generators in practice, of course; I just
forgot to in the example I gave.
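(One caveat worth noting, sketched below with plain integers rather
than Symbols: a generator avoids building the intermediate list, but
star-unpacking, as in Add(*gen), still materializes every argument
into a tuple for the call, so the memory saving mainly helps the
reduce() path, not the Add(*...) path.)

```python
# Sketch: generator vs. list comprehension for the quadratic product
# sweep.  The generator object itself stays tiny, but star-unpacking
# it into a call still materializes all the terms.
import sys

xs = range( 30 )
ys = range( 30 )

as_list = [ x * y for x in xs for y in ys ]
as_gen  = ( x * y for x in xs for y in ys )

# The generator object is far smaller than the fully built list ...
print( sys.getsizeof( as_gen ) < sys.getsizeof( as_list ) )  # True

# ... but consuming it still yields the same 900 terms.
print( sum( 1 for _ in as_gen ) )  # 900
```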

> You do realize that the size of this list increases quadratically with size?

Heh, of course.  These were just my first forays into Sympy with a
calculation that closely parallels what our framework already handles
for 10^6+ calculations.

> As for speed, if the expressions you are dealing with are only polynomial in
> nature, you might achieve better speed by installing gmpy, running sympy with
> the SYMPY_GROUND_TYPES=gmpy set in bash, and using Polys.  Note that this
> gmpy speed up is only available in our development branch for the moment.

> Polys might not be faster for multiple variables at the moment because they
> use a dense multivariate representation based on nested lists, but Mateusz
> has promised to merge in a sparse dense representation based on dictionaries
> that will be faster.

Gotcha.  I'll have to play around with both of those.  Good pointers.
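(Here's roughly what I expect that experiment to look like: a crude
comparison of plain Expr arithmetic against Poly arithmetic for a
polynomial-only workload.  Whether Poly wins will depend on the
version and on whether the gmpy ground types are active, so this is a
probe, not a benchmark.)

```python
# Sketch: compare expanded Expr multiplication to Poly multiplication.
from time import time

from sympy import Poly, expand, symbols

x, y = symbols( 'x y' )
e1 = (x + y) ** 8
e2 = (x - y) ** 8

start = time()
expr_prod = expand( e1 * e2 )
expr_time = time() - start

start = time()
poly_prod = Poly( e1, x, y ) * Poly( e2, x, y )
poly_time = time() - start

print( 'Expr: %.4fs  Poly: %.4fs' % (expr_time, poly_time) )

# Sanity check: both routes agree on the result.
print( poly_prod.as_expr() == expr_prod )  # True
```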

> However, you will have to try it [with your use-cases].

It's amazing how this principle of "try it for yourself" prevails.
Noobs, like myself with your library, always want a simple answer to
"What's the best way to do this?"  The best answers are often,
frustratingly, "it depends."  :-)

> There is also support in the Polys, in master, for cython.  I am not too sure
> how to get this working, so someone else will have to help there, but it
> should provide an additional speed up.

Noted.  I'm more worried about algorithmic efficiency (i.e., big-O
behavior) than about constant-factor speedups.  It depends on whom you
talk to, but I tend to fall into the camp that views this kind of
low-hanging fruit (Python -> C) as a one-time gain, not really a
"true" measure of success.  But, as always, context is everything.

Thanks much for your answers.  You've already given me some research
tasks.

-- 
You received this message because you are subscribed to the Google Groups 
"sympy" group.