Re: [Haskell-cafe] Re: Code and Perf. Data for Prime Finders (was: Genuine Eratosthenes sieve)
Hello, On Wednesday 25 July 2007 01:42, Thorkil Naur wrote: Hello Melissa, On Tuesday 24 July 2007 19:09, Melissa O'Neill wrote: ... (See ONeillPrimes.hs in http://www.cs.hmc.edu/~oneill/code/haskell- primes.zip for the complete code. I've also added Thorkil Naur's code from March, which is also a good performer, Do you have detailed measurements that you wish to share? I would be most interested, I assure you. although its another case where most readers would find a detailed explanation of the code instructive.) I'll do my very best to provide such an explanation within, say, the next couple of weeks. ... And now that time has come, so brace yourselves. For your convenience, my code from March is thorkilnaur.dk/~tn/T64_20070303_1819.tar.gz See also a preliminary description in http://www.haskell.org/pipermail/haskell-cafe/2007-March/023095.html. The new explanation is here: thorkilnaur.dk/~tn/Haskell/EratoS/EratoS2.txt Best regards Thorkil ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: Code and Perf. Data for Prime Finders (was: Genuine Eratosthenes sieve)
apfelmus wrote: After some pondering, the List a data structure for merging is really ingenious! :) Here's a try to explain how it works: Thanks apfelmus! A detailed explanation of this code is really helpful for anyone trying to understand what is going on. The VIP/ Crowd analogy is also very nice. I think that this approach has the potential to outperform O'Neill's (old?) code because it doesn't incorporates another prime number to the sieve mechanism until it really has to. I mean the following: in the 1st, 2nd, 3rd, 4th, ... step, O'Neill's code adds the multiples 2*2, 3*3, 5*5, 7*7, ... to the priority queue and uses them to sieve for potential prime numbers from then on. Yeah, that's (only) what the code in my paper does -- it's good enough for explicative purposes, but all recent versions have used a slightly augmented priority queue. It's a priority queue coupled with a feeder list that provides entries to add to the queue (in increasing order). They are only admitted to the heap data structure only once when the root of the heap gets there. The two most important bits are: type HybridQ k v = (PriorityQ k v, [(k,v)]) -- postRemoveHQ is called when the min element of the heap has changed postRemoveHQ :: Ord k = HybridQ k v - HybridQ k v postRemoveHQ mq@(pq, []) = mq postRemoveHQ mq@(pq, (qk,qv) : qs) | qk minKeyPQ pq = (insertPQ qk qv pq, qs) | otherwise= mq (See ONeillPrimes.hs in http://www.cs.hmc.edu/~oneill/code/haskell- primes.zip for the complete code. I've also added Thorkil Naur's code from March, which is also a good performer, although its another case where most readers would find a detailed explanation of the code instructive.) the approach here adds 5*5=25 to the heap only after considering the 9th prime 23. Yep, that's what mine does too. Best Regards, Melissa. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: Code and Perf. Data for Prime Finders (was: Genuine Eratosthenes sieve)
Hello Melissa, On Tuesday 24 July 2007 19:09, Melissa O'Neill wrote: apfelmus wrote: After some pondering, the List a data structure for merging is really ingenious! :) Here's a try to explain how it works: Thanks apfelmus! A detailed explanation of this code is really helpful for anyone trying to understand what is going on. The VIP/ Crowd analogy is also very nice. I think that this approach has the potential to outperform O'Neill's (old?) code because it doesn't incorporates another prime number to the sieve mechanism until it really has to. I mean the following: in the 1st, 2nd, 3rd, 4th, ... step, O'Neill's code adds the multiples 2*2, 3*3, 5*5, 7*7, ... to the priority queue and uses them to sieve for potential prime numbers from then on. Yeah, that's (only) what the code in my paper does -- it's good enough for explicative purposes, but all recent versions have used a slightly augmented priority queue. It's a priority queue coupled with a feeder list that provides entries to add to the queue (in increasing order). They are only admitted to the heap data structure only once when the root of the heap gets there. The two most important bits are: type HybridQ k v = (PriorityQ k v, [(k,v)]) -- postRemoveHQ is called when the min element of the heap has changed postRemoveHQ :: Ord k = HybridQ k v - HybridQ k v postRemoveHQ mq@(pq, []) = mq postRemoveHQ mq@(pq, (qk,qv) : qs) | qk minKeyPQ pq = (insertPQ qk qv pq, qs) | otherwise= mq (See ONeillPrimes.hs in http://www.cs.hmc.edu/~oneill/code/haskell- primes.zip for the complete code. I've also added Thorkil Naur's code from March, which is also a good performer, Do you have detailed measurements that you wish to share? I would be most interested, I assure you. although its another case where most readers would find a detailed explanation of the code instructive.) I'll do my very best to provide such an explanation within, say, the next couple of weeks. the approach here adds 5*5=25 to the heap only after considering the 9th prime 23. Yep, that's what mine does too. Best Regards, Melissa. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe Thanks a lot and the best regards Thorkil ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: Code and Perf. Data for Prime Finders (was: Genuine Eratosthenes sieve)
Hi But for the current version of my code, there is still a bit of a performance gap between our two methods. Here are the stats I get (ghc -O3, 2.4GHz x86): Are you aware that -O3 is slower than -O2 and -O in ghc? If you want fast code then specify -O2, not -O3. Thanks Neil ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: Code and Perf. Data for Prime Finders (was: Genuine Eratosthenes sieve)
Neil Mitchell wrote: -O3 is slower than -O2 and -O in ghc? If you want fast code then specify -O2, not -O3. Oops. That ought to have been -O2. But from what I can tell, -O3 is harmless (at least in this case). Both invocations generate identical executables, at least for these examples on my machine. Melissa. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: Code and Perf. Data for Prime Finders (was: Genuine Eratosthenes sieve)
OK. Another weird thing is that much of the Haskell code seems to work with Integer whereas the C code uses int. That doesn't seem fair. -- Lennart On Feb 25, 2007, at 02:40 , Melissa O'Neill wrote: Someone asked if I'd include a classic C version of the Sieve in my comparisons. Having done so, Lennart wrote (slightly rephrased): How did you compare the C version with the Haskell versions? The Haskell programs produce the Nth prime, whereas the C code produces the last prime less than M. True. But since I have to know what M is to find the Nth prime, it's easy enough to ask the C code to produce the right prime. To make the C code to what the Haskell code does you need to set some upper bound that is related to the prime number distribution. I see no trace of this in your code. The Haskell versions that go up to a limit do this, so I could easily have written code to do it -- it's not hard, but has no real bearing on the time complexity of the code, so I didn't bother. You could argue that it's cheating to tell it so blatantly when to stop, but I hate the C code I'd found enough that I didn't really want to touch it any more than I had to. A much more legitimate complaint about the comparison with the C code is actually on space usage. It uses much more space than some of the algorithms it's competing with. More about that in an upcoming message. Melissa. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: Code and Perf. Data for Prime Finders (was: Genuine Eratosthenes sieve)
For those enjoying the fun with prime finding, I've updated the source at http://www.cs.hmc.edu/~oneill/code/haskell-primes.zip I've tweaked my code a little to improve its space behavior when finding primes up to some limit, added an up-to-limit version of the Naive Primes algorithm, and added Oleg's prime finding code too. I also got a chance to look at space usage more generally. I won't reproduce a table here, but the conclusions were more-or-less what you'd expect. The unlimited list algorithms used O(n) space to find n primes (except for Runciman's algorithm, which appeared to be much worse), and the primes up to a limit algorithms used O(sqrt (n)) space to find the nth prime. Both of these are better than the classic C algorithm, which uses O(n log n) space to find the nth prime. For example, heap profiling shows that my own O(sqrt(n)) algorithm uses only 91200 bytes to find the 10^7th prime, whereas the classic C algorithm needs at least 11214043 bytes for its array -- a factor of more than 100 different, and one that gets worse for larger n. Lennart Augustsson wrote: Another weird thing is that much of the Haskell code seems to work with Integer whereas the C code uses int. Originally, I was comparing Haskell with Haskell, and for that purpose I wanted to have a level playing field, so going with Integer everywhere made sense. That doesn't seem fair. Actually, to the extent that any of the comparisons are fair, I think this one is too. After all, typical Haskell code uses Integer and typical C code uses int. I could use arrays in my Haskell code and never use laziness, but when I program in Haskell, I'm not trying to exactly recreate C programs, but rather write their Haskell equivalents. For example, to me, producing a lazy list was essential for a true Haskell feel. For some people, the Haskell feel also includes treating the language as a declarative specification language where brevity is everything -- but for me, other things (like fundamental algorithmic efficiency and faithfulness to the core ideas that make the Sieve of Eratosthenes an *efficient* algorithm) are universal and ought to be common to both C and Haskell versions. But to allow a better comparison with C, I've added a run for an Int version of my algorithm. With that change, my code is closer to the speed of the C code. More interestingly, for larger n, I seem to be narrowing the gap. At 10^6, my code runs nearly 30 times slower than the classic C version, but at 10^8, I'm only about 20 times slower. This is especially interesting to me there was some (reasonable looking) speculation from apfelmus several days ago, that suggested that my use of a priority queue incurred an extra log(n) overhead, from which you would expect a worse asymptotic complexity, not equivalent or better. Melissa. Enc. (best viewed with a fixed-width font) -- Time (in seconds) for Number of Primes Algorithm 10^310^4 10^5 10^6 10^7 10^8 -- C-Sieve 0.00 0.00 0.01 0.29 5.1288.24 O'Neill (#3) 0.01 0.04 0.55 8.34122.62 1779.18 O'Neill (#2) 0.01 0.06 0.9513.85194.96 2699.61 O'Neill (#1) 0.01 0.07 1.0715.95230.11 - Bromage 0.02 0.39 6.50 142.85 - - sieve (#3) 0.01 0.25 7.28 213.19 - - Naive (#2)0.02 0.5914.70 386.40 - - Naive (#1)0.32 0.6616.04 419.22 - - Runciman 0.02 0.7429.25- - - Reinke0.04 1.2141.00- - - Zilibowitz0.02 2.50 368.33- - - Gale (#1) 0.12 17.99-- - - sieve (#1) 0.16 32.59-- - - sieve (#2) 0.01 32.76-- - - Oleg 0.18 68.40-- - - Gale (#2) 1.36268.65-- - - -- - The dashes in the table mean I gave up waiting (i.e., 500 seconds) - sieve (#1) is the classic example we're all familiar with - sieve (#2) is the classic example, but sieving a list without multiples of 2,3,5, or 7 -- notice how it makes no real difference - sieve (#3) is the classic example, but generating a lazy-but- finite list (see below) - O'Neill (#1) is basically the algorithm of mine discussed in http:// www.cs.hmc.edu/~oneill/papers/Sieve-JFP.pdf, with a few minor tweaks - O'Neill (#2) is a variant of that algorithm that generates a lazy-
Re: [Haskell-cafe] Re: Code and Perf. Data for Prime Finders (was: Genuine Eratosthenes sieve)
Here's another program you can add. It's fairly short and efficient. -- Lennart import System (getArgs) infixr : data StreamInt = !Int : StreamInt (!) :: StreamInt - Int - Int (x : _) ! 0 = x (_ : xs) ! n = xs ! (n-1) -- By replacing lprimes on the next line by '5 : gen 7 4 2' this algorithm -- runs in very little space, but is somewhat slower. primes = 2 : 3 : lprimes where isPrime (p:ps) n = n `rem` p /= 0 (p*p n || isPrime ps n) lprimes = 5 : gen 7 4 2 gen n a b = if isPrime lprimes n then n : gen (n+a) b a else gen (n+a) b a printNthPrime n = print (n, primes ! (n-1)) main = do args - getArgs printNthPrime $ read $ head args On Feb 25, 2007, at 12:51 , Melissa O'Neill wrote: For those enjoying the fun with prime finding, I've updated the source at http://www.cs.hmc.edu/~oneill/code/haskell-primes.zip I've tweaked my code a little to improve its space behavior when finding primes up to some limit, added an up-to-limit version of the Naive Primes algorithm, and added Oleg's prime finding code too. I also got a chance to look at space usage more generally. I won't reproduce a table here, but the conclusions were more-or-less what you'd expect. The unlimited list algorithms used O(n) space to find n primes (except for Runciman's algorithm, which appeared to be much worse), and the primes up to a limit algorithms used O (sqrt(n)) space to find the nth prime. Both of these are better than the classic C algorithm, which uses O (n log n) space to find the nth prime. For example, heap profiling shows that my own O(sqrt(n)) algorithm uses only 91200 bytes to find the 10^7th prime, whereas the classic C algorithm needs at least 11214043 bytes for its array -- a factor of more than 100 different, and one that gets worse for larger n. Lennart Augustsson wrote: Another weird thing is that much of the Haskell code seems to work with Integer whereas the C code uses int. Originally, I was comparing Haskell with Haskell, and for that purpose I wanted to have a level playing field, so going with Integer everywhere made sense. That doesn't seem fair. Actually, to the extent that any of the comparisons are fair, I think this one is too. After all, typical Haskell code uses Integer and typical C code uses int. I could use arrays in my Haskell code and never use laziness, but when I program in Haskell, I'm not trying to exactly recreate C programs, but rather write their Haskell equivalents. For example, to me, producing a lazy list was essential for a true Haskell feel. For some people, the Haskell feel also includes treating the language as a declarative specification language where brevity is everything -- but for me, other things (like fundamental algorithmic efficiency and faithfulness to the core ideas that make the Sieve of Eratosthenes an *efficient* algorithm) are universal and ought to be common to both C and Haskell versions. But to allow a better comparison with C, I've added a run for an Int version of my algorithm. With that change, my code is closer to the speed of the C code. More interestingly, for larger n, I seem to be narrowing the gap. At 10^6, my code runs nearly 30 times slower than the classic C version, but at 10^8, I'm only about 20 times slower. This is especially interesting to me there was some (reasonable looking) speculation from apfelmus several days ago, that suggested that my use of a priority queue incurred an extra log(n) overhead, from which you would expect a worse asymptotic complexity, not equivalent or better. Melissa. Enc. (best viewed with a fixed-width font) -- Time (in seconds) for Number of Primes Algorithm 10^310^4 10^5 10^6 10^7 10^8 -- C-Sieve 0.00 0.00 0.01 0.29 5.1288.24 O'Neill (#3) 0.01 0.04 0.55 8.34122.62 1779.18 O'Neill (#2) 0.01 0.06 0.9513.85194.96 2699.61 O'Neill (#1) 0.01 0.07 1.0715.95230.11 - Bromage 0.02 0.39 6.50 142.85 - - sieve (#3) 0.01 0.25 7.28 213.19 - - Naive (#2)0.02 0.5914.70 386.40 - - Naive (#1)0.32 0.6616.04 419.22 - - Runciman 0.02 0.7429.25- - - Reinke0.04 1.2141.00- - - Zilibowitz0.02 2.50 368.33- - - Gale (#1) 0.12 17.99-- - - sieve (#1) 0.16 32.59-- - - sieve (#2) 0.01 32.76-- - - Oleg 0.18
Re: [Haskell-cafe] Re: Code and Perf. Data for Prime Finders (was: Genuine Eratosthenes sieve)
G'day all. This one is pretty elegant. A Pritchard sieve is actually an Eratosthenes sieve with the loops reversed. Unfortunately, it's a bit slower. Maybe someone else can speed it up a bit. mergeRemove :: [Integer] - [Integer] - [Integer] mergeRemove [] ys = [] mergeRemove xs [] = xs mergeRemove xs'@(x:xs) ys'@(y:ys) = case compare x y of LT - x : mergeRemove xs ys' EQ - mergeRemove xs ys GT - mergeRemove xs' ys pritchardSieve :: Integer - [Integer] pritchardSieve n | n = 16 = takeWhile (=n) [2,3,5,7,11,13] | otherwise = removeComposites [2..n] (sieve [2..n`div`2]) where removeComposites ps [] = ps removeComposites ps (cs@(c:_):css) = removeComposites' ps where removeComposites' [] = [] removeComposites' (p:ps) | p c = p : removeComposites' ps | otherwise = removeComposites (mergeRemove ps cs) css pjs = pritchardSieve sn sn = isqrt n sieve [] = [] sieve (f:fs) = composites pjs : sieve fs where composites [] = [] composites (p:ps) | pf n || f `mod` p == 0 = [pf] | otherwise = pf : composites ps where pf = p*f Cheers, Andrew Bromage ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: Code and Perf. Data for Prime Finders (was: Genuine Eratosthenes sieve)
Someone asked if I'd include a classic C version of the Sieve in my comparisons. Having done so, Lennart wrote (slightly rephrased): How did you compare the C version with the Haskell versions? The Haskell programs produce the Nth prime, whereas the C code produces the last prime less than M. True. But since I have to know what M is to find the Nth prime, it's easy enough to ask the C code to produce the right prime. To make the C code to what the Haskell code does you need to set some upper bound that is related to the prime number distribution. I see no trace of this in your code. The Haskell versions that go up to a limit do this, so I could easily have written code to do it -- it's not hard, but has no real bearing on the time complexity of the code, so I didn't bother. You could argue that it's cheating to tell it so blatantly when to stop, but I hate the C code I'd found enough that I didn't really want to touch it any more than I had to. A much more legitimate complaint about the comparison with the C code is actually on space usage. It uses much more space than some of the algorithms it's competing with. More about that in an upcoming message. Melissa. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: Code and Perf. Data for Prime Finders (was: Genuine Eratosthenes sieve)
G'day all. Quoting Melissa O'Neill [EMAIL PROTECTED]: Cool, thanks. When I ran your code trying to find the 10,000th prime, I got AtkinSieveTest: Ix{Integer}.index: Index (36213) out of range ((0,36212)) but that went away when I made your array one bigger. Fixed, thanks. Cheers, Andrew Bromage ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe