Comment #1 on issue 3950 by [email protected]: Speed of our prime algorithms
http://code.google.com/p/sympy/issues/detail?id=3950

RWH's numpy version is very fast and uses less memory than the others. His pure Python version is also quite fast. Both run circles around the current SymPy implementation, though note that SymPy is using a generator, so the comparison isn't entirely apples to apples.
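
I don't have RWH's exact source in front of me, but a minimal odds-only numpy sieve in that style looks something like this (my reconstruction, not his code):

    import numpy as np

    def primes_upto(n):
        # Primes below n (assumes n >= 3), sieving odd numbers only.
        # sieve[i] stands for the odd number 2*i + 1, so the flags take
        # n/2 bytes: ~400 MB at n = 800 million.
        sieve = np.ones(n // 2, dtype=np.bool_)
        for i in range(3, int(n ** 0.5) + 1, 2):
            if sieve[i // 2]:
                sieve[i * i // 2::i] = False   # cross off i*i, i*i + 2i, ...
        return np.r_[2, 2 * np.nonzero(sieve)[0][1:] + 1]

The inner loop is all numpy slice assignment, which is why it runs circles around the same algorithm in pure Python. The ~270 MB figure below suggests RWH's real version sieves a mod-6 wheel (n/3 flags) rather than just the odds.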

On my computer, counting primes to 800 million:

  0.006s   C, Lehmer's method, < 64 kB of RAM
  0.03s    C, segmented sieve + ~400 table entries, < 64 kB
  0.27s    C, segmented sieve, ~128 kB
  0.39s    Perl, Lehmer's method, ~2 MB
  2.9s     C, 4-line monolithic sieve, ~50 MB
 54.8s     Perl, hacked-up segmented string sieve, ~30 MB
 57.5s     Perl, monolithic string sieve, ~1200 MB (ouch)

  2.8s     Python, RWH numpy, ~270 MB
 14.0s     Python, RWH pure, ~2500 MB (ouch)
184.4s     Python, SymPy.MPMATH.primepi, 25800 MB (OMG)
291.4s     Python, SymPy.primepi, 12500 MB (Yeouch)

Lehmer's method is "cheating", though it should perhaps be considered for primepi (or LMO, if someone is ambitious). It is far faster than sieving. It turns out that a very reasonable number of table entries combined with a segmented sieve gives a lot of speedup for counting in the small ranges (e.g. under 10 billion or so). Anyway, these methods are really just for the prime count and the nth prime, rather than for generating primes, though SymPy could use some serious help there too (Math::Prime::Util can give the 50,000,000,000th prime in about 1 second, without any tables, though all the hard work is in C).
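
Lehmer's method proper runs to a page or two of code, but its ancestor, Legendre's formula, fits in a dozen lines and shows the flavor of counting without sieving. A rough sketch (untuned; primerange is used only to fetch the small primes, and for large x you'd want the usual base-case shortcuts and a higher recursion limit):

    from functools import lru_cache
    from math import isqrt
    from sympy import primerange

    def legendre_pi(x):
        # pi(x) = phi(x, a) + a - 1, where a = pi(sqrt(x)) and phi(n, b)
        # counts the m <= n with no prime factor among the first b primes.
        if x < 2:
            return 0
        primes = list(primerange(2, isqrt(x) + 1))
        a = len(primes)

        @lru_cache(maxsize=None)
        def phi(n, b):
            if b == 0:
                return n
            return phi(n, b - 1) - phi(n // primes[b - 1], b - 1)

        return phi(x, a) + a - 1

    legendre_pi(10**6)   # 78498, without enumerating most of those primes

Roughly speaking, Lehmer's refinement (and LMO after it) consists of cutting that phi recursion off early, which is where the big constant-factor wins come from.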

The memory use of the current methods is crazy high. If we want to keep a generator that remembers everything from 0 to <limit>, then we've immediately thrown out any hope of efficiently answering things like "primes between 10^15 and 10^15 + 1000000". Memory use becomes quite important. Numpy can do this efficiently, with 15 bytes per 30 numbers (sieving odds only) or 8 bytes per 30 (a mod-30 wheel). It wasn't clear to me how to get pure Python to do this with clean, efficient code, but I'm not a Python expert.
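
To put numbers on those two layouts (a sketch; the sizes here are mine, not from the benchmarks above):

    import numpy as np

    N = 10**8

    # Odds only: one byte-sized flag per odd number -> 15 bytes per 30.
    odds = np.ones(N // 2, dtype=np.bool_)              # ~50 MB at N = 10**8

    # Mod-30 wheel: flags only for the 8 residues coprime to 30
    # (1, 7, 11, 13, 17, 19, 23, 29) -> 8 bytes per 30.
    wheel = np.ones((N // 30 + 1) * 8, dtype=np.bool_)  # ~27 MB at N = 10**8

np.packbits can shrink either layout by another factor of 8 if bit twiddling is acceptable, at the cost of messier indexing.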

The possibility I personally like is to toss out the monolithic generator and go with a segmented sieve, which makes memory use essentially fixed at whatever reasonable segment size you choose. It can still use a generator inside each segment.
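
A pure-Python sketch of what I mean (primerange here is just a convenient source of sieving primes up to sqrt(hi); memory is bounded by segment_size no matter how large lo is):

    from math import isqrt
    from sympy import primerange

    def primes_in_range(lo, hi, segment_size=2**20):
        # Yield the primes in [lo, hi), one segment_size window at a time.
        sieving = list(primerange(2, isqrt(max(hi - 1, 1)) + 1))
        for start in range(max(lo, 2), hi, segment_size):
            end = min(start + segment_size, hi)
            flags = bytearray([1]) * (end - start)
            for p in sieving:
                if p * p >= end:
                    break
                # First multiple of p in [start, end), skipping p itself.
                first = max(p * p, (start + p - 1) // p * p)
                flags[first - start::p] = bytes(
                    len(range(first - start, end - start, p)))
            for i, f in enumerate(flags):
                if f:
                    yield start + i

    list(primes_in_range(10**12, 10**12 + 100))   # no 10**12-sized array needed

This answers the "primes between 10^15 and 10^15 + 1000000" question directly, and a plain primes-from-2 generator falls out as primes_in_range(2, limit).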
