Comment #1 on issue 3950 by [email protected]: Speed of our prime algorithms
http://code.google.com/p/sympy/issues/detail?id=3950

RWH's numpy version is very fast and uses less memory than the others. His pure Python version is also quite fast. Both run circles around the current SymPy implementation, though note that SymPy is using a generator, so the comparison isn't entirely apples to apples.
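
I don't have RWH's exact source in front of me, but a minimal odds-only numpy sieve in that style looks something like this (my reconstruction, not his code):

    import numpy as np

    def primes_upto(n):
        # Primes below n (assumes n >= 3), sieving odd numbers only.
        # sieve[i] stands for the odd number 2*i + 1, so the flags take
        # n/2 bytes: ~400 MB at n = 800 million.
        sieve = np.ones(n // 2, dtype=np.bool_)
        for i in range(3, int(n ** 0.5) + 1, 2):
            if sieve[i // 2]:
                sieve[i * i // 2::i] = False   # cross off i*i, i*i + 2i, ...
        return np.r_[2, 2 * np.nonzero(sieve)[0][1:] + 1]

The inner loop is all numpy slice assignment, which is why it runs circles around the same algorithm in pure Python. The ~270 MB figure below suggests RWH's real version sieves a mod-6 wheel (n/3 flags) rather than just the odds.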

On my computer, counting primes to 800 million:

  0.006s   C, Lehmer's method, < 64 kB of RAM
  0.03s    C, segmented sieve + ~400 table entries, < 64 kB
  0.27s    C, segmented sieve, ~128 kB
  0.39s    Perl, Lehmer's method, ~2 MB
  2.9s     C, 4-line monolithic sieve, ~50 MB
 54.8s     Perl, hacked-up segmented string sieve, ~30 MB
 57.5s     Perl, monolithic string sieve, ~1200 MB (ouch)

  2.8s     Python, RWH numpy, ~270 MB
 14.0s     Python, RWH pure, ~2500 MB (ouch)
184.4s     Python, SymPy.MPMATH.primepi, 25800 MB (OMG)
291.4s     Python, SymPy.primepi, 12500 MB (Yeouch)

Lehmer's method is "cheating", though it should perhaps be considered for primepi (or LMO, if someone is ambitious). It is far faster than sieving. It turns out that a very reasonable number of table entries combined with a segmented sieve gives a lot of speedup for counting in the small ranges (e.g. under 10 billion or so). Anyway, these methods are really just for the prime count and the nth prime, rather than for generating primes, though SymPy could use some serious help there too (Math::Prime::Util can give the 50,000,000,000th prime in about 1 second, without any tables, though all the hard work is in C).
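
Lehmer's method proper runs to a page or two of code, but its ancestor, Legendre's formula, fits in a dozen lines and shows the flavor of counting without sieving. A rough sketch (untuned; primerange is used only to fetch the small primes, and for large x you'd want the usual base-case shortcuts and a higher recursion limit):

    from functools import lru_cache
    from math import isqrt
    from sympy import primerange

    def legendre_pi(x):
        # pi(x) = phi(x, a) + a - 1, where a = pi(sqrt(x)) and phi(n, b)
        # counts the m <= n with no prime factor among the first b primes.
        if x < 2:
            return 0
        primes = list(primerange(2, isqrt(x) + 1))
        a = len(primes)

        @lru_cache(maxsize=None)
        def phi(n, b):
            if b == 0:
                return n
            return phi(n, b - 1) - phi(n // primes[b - 1], b - 1)

        return phi(x, a) + a - 1

    legendre_pi(10**6)   # 78498, without enumerating most of those primes

Roughly speaking, Lehmer's refinement (and LMO after it) consists of cutting that phi recursion off early, which is where the big constant-factor wins come from.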

The memory use of the current methods is crazy high. If we want to keep a generator that remembers everything from 0 to <limit>, then we've immediately thrown out any hope of efficiently answering things like "primes between 10^15 and 10^15 + 1000000". Memory use becomes quite important. Numpy can do this efficiently, with 15 bytes per 30 numbers (sieving odds only) or 8 bytes per 30 (a mod-30 wheel). It wasn't clear to me how to get pure Python to do this with clean, efficient code, but I'm not a Python expert.
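
To put numbers on those two layouts (a sketch; the sizes here are mine, not from the benchmarks above):

    import numpy as np

    N = 10**8

    # Odds only: one byte-sized flag per odd number -> 15 bytes per 30.
    odds = np.ones(N // 2, dtype=np.bool_)              # ~50 MB at N = 10**8

    # Mod-30 wheel: flags only for the 8 residues coprime to 30
    # (1, 7, 11, 13, 17, 19, 23, 29) -> 8 bytes per 30.
    wheel = np.ones((N // 30 + 1) * 8, dtype=np.bool_)  # ~27 MB at N = 10**8

np.packbits can shrink either layout by another factor of 8 if bit twiddling is acceptable, at the cost of messier indexing.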

The possibility I personally like is to toss out the monolithic generator and go with a segmented sieve, which makes memory use essentially fixed at whatever reasonable segment size you choose. It can still use a generator inside each segment.
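
A pure-Python sketch of what I mean (primerange here is just a convenient source of sieving primes up to sqrt(hi); memory is bounded by segment_size no matter how large lo is):

    from math import isqrt
    from sympy import primerange

    def primes_in_range(lo, hi, segment_size=2**20):
        # Yield the primes in [lo, hi), one segment_size window at a time.
        sieving = list(primerange(2, isqrt(max(hi - 1, 1)) + 1))
        for start in range(max(lo, 2), hi, segment_size):
            end = min(start + segment_size, hi)
            flags = bytearray([1]) * (end - start)
            for p in sieving:
                if p * p >= end:
                    break
                # First multiple of p in [start, end), skipping p itself.
                first = max(p * p, (start + p - 1) // p * p)
                flags[first - start::p] = bytes(
                    len(range(first - start, end - start, p)))
            for i, f in enumerate(flags):
                if f:
                    yield start + i

    list(primes_in_range(10**12, 10**12 + 100))   # no 10**12-sized array needed

This answers the "primes between 10^15 and 10^15 + 1000000" question directly, and a plain primes-from-2 generator falls out as primes_in_range(2, limit).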
