[Haskell-cafe] Re: Code and Perf. Data for Prime Finders (was: Genuine Eratosthenes sieve)

Dave Bayer Sun, 22 Jul 2007 16:54:09 -0700

Here is another prime sieve. It is about half the length of thefastest contributed code (ONeillPrimes) and nearly as fast until itblows up on garbage collection:

% cat ONeillPrimes.hs | grep -v "^--" | wc
     185    1105    6306
% cat BayerPrimes.hs  | grep -v "^--" | wc
      85     566    2418


[Integer] -O   1*10^6   2*10^6   3*10^6   4*10^6   5*10^6
---------------------------------------------------------
ONeillPrimes |  3.555 |  7.798 | 12.622 | 18.927 | 23.529
BayerPrimes  |  3.999 |  8.895 | 18.003 | 22.977 | 38.053

I wrote this as an experiment in coding data structures in Haskell'slazy evaluation model, rather than as explicit data. The majority ofthe work done by this code is done by "merge"; the multiples of eachprime percolate up through a tournament consisting of a balanced treeof suspended "merge" function calls. In order to create an infinitelazy balanced tree of lists, the datatype

data List a = A a (List a) | B [a]

is used as scaffolding. One thinks of the root of the infinite treeas starting at the leftmost child, and crawling up the left spine asnecessary.


What I'm calling a "venturi"

venturi :: Ord a => [[a]] -> [a]

merges an infinite list of infinite lists into one list, under theassumption that each list, and the heads of the lists, are inincreasing order. This could be a generally useful function. If onecan think of a better way to write "venturi", swapping in your codewould in particular yield a faster prime sieve.

I found that a tertiary merge tree was faster than a binary mergetree, because this leads to fewer suspensions. One can speed thiscode up a bit by interleaving strict and lazy calls, but I prefer toleave the code short and readable.


It is bizarre that

trim p = let f m x = mod x m /= 0 in filter (f p)

lurks in the prime sieve code, but it is only used with primes ofsize up to the square root of the largest output prime. I tried morethoughtful alternatives, and they all slowed down the sieve.Sometimes dumb is beautiful.

Thanks to apfelmus for various helpful remarks that lead me to thinkof this approach.


Here is the code:

module BayerPrimes (venturi,primes) where

-- Code for venturi :: Ord a => [[a]] -> [a]

merge :: Ord a => [a] -> [a] -> [a] -> [a]
merge xs@(x:xt) ys@(y:yt) zs@(z:zt)
    | x <= y = if x <= z
        then x : (merge xt ys zs)
        else z : (merge xs ys zt)
    | otherwise = if y <= z
        then y : (merge xs yt zs)
        else z : (merge xs ys zt)
merge _ _ _ = undefined

data List a = A a (List a) | B [a]

mergeA :: Ord a => List a -> List a -> List a -> List a
mergeA (A x xt) ys zs = A x (mergeA xt ys zs)
mergeA (B xs)   ys zs = mergeB xs ys zs

mergeB :: Ord a => [a] -> List a -> List a -> List a
mergeB xs@(x:xt) ys@(A y yt) zs = case compare x y of
    LT -> A x (mergeB xt ys zs)
    EQ -> A x (mergeB xt yt zs)
    GT -> A y (mergeB xs yt zs)
mergeB xs (B ys) zs = mergeC xs ys zs
mergeB _ _ _ = undefined

mergeC :: Ord a => [a] -> [a] -> List a -> List a
mergeC xs@(x:xt) ys@(y:yt) zs@(A z zt)
    | x < y = if x < z
        then A x (mergeC xt ys zs)
        else A z (mergeC xs ys zt)
    | otherwise = if y < z
        then A y (mergeC xs yt zs)
        else A z (mergeC xs ys zt)
mergeC xs ys (B zs) = B $ merge xs ys zs
mergeC _ _ _ = undefined

root :: Ord a => List a -> [List a] -> [a]
root (A x xt) yss       = x : (root xt yss)
root (B xs) (ys:zs:yst) = root (mergeB xs ys zs) yst
root _ _ = undefined

wrap :: [a] -> List a
wrap [] = B []
wrap (x:xt) = A x $ B xt

triple :: Ord a => [List a] -> [List a]
triple (x:y:z:xs) = mergeA x y z : (triple xs)
triple _ = undefined

group :: Ord a => [List a] -> [List a]
group (x:y:xt) = x : y : (group $ triple xt)
group _ = undefined

venturi :: Ord a => [[a]] -> [a]
venturi (x:xt) = root (wrap x) $ group $ map wrap xt
venturi _ = undefined

-- Code for primes :: Integral a => [a]

diff  :: Ord a => [a] -> [a] -> [a]
diff xs@(x:xt) ys@(y:yt) = case compare x y of
    LT -> x : (diff  xt ys)
    EQ ->     (diff  xt yt)
    GT ->     (diff  xs yt)
diff _ _ = undefined

trim :: Integral a => a -> [a] -> [a]
trim p = let f m x = mod x m /= 0 in filter (f p)

seed :: Integral a => [a]
seed = [2,3,5,7,11,13,17]

wheel :: Integral a => [a]
wheel = drop 1 [ m*j + k | j <- [0..], k <- ws ]
    where m  = foldr1 (*) seed
          ws = foldr trim [1..m] seed

multiples :: Integral a => [a] -> [[a]]
multiples ws = map fst $ tail $ iterate g ([], ws)
    where g (_,ps@(p:pt)) = ([ m*p | m <- ps ], trim p pt)
          g _ = undefined

primes :: Integral a => [a]
primes = seed ++ (diff wheel $ venturi $ multiples wheel)


_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

[Haskell-cafe] Re: Code and Perf. Data for Prime Finders (was: Genuine Eratosthenes sieve)

Reply via email to