As an exercise, trying to understand the beautiful paper

        Stream Fusion
        From Lists to Streams to Nothing at All
        Duncan Coutts, Roman Leshchinskiy and Don Stewart
        http://www.cse.unsw.edu.au/~dons/papers/CLS07.html
        http://www.cse.unsw.edu.au/~dons/streams.html

I recoded my prime sieve using a pared down version of their Stream datatype; this is the simplest version I could write that achieves a significant speedup.

My reaction to their paper was, if streams are better internally than lists, why not code directly in streams? Lists enjoy a serious notational advantage in Haskell, but one could imagine a language where the list notation was reserved for stream semantics.

My sieve was spending half its time in "merge", so I made only the changes necessary to convert "merge" to use streams. My streams are infinite, and "merge" can be written to not use Skip, so Step goes away.

Even though "nextx" and "nexty" only have one case now, using case statements is significantly faster than using let or where clauses. I'm imagining that I read about this somewhere, but if I did, it didn't sink in until I was tuning this code. I don't know if this is related to fusion optimization, or a general effect.

The timings are

[Integer] -O2  1*10^6   2*10^6   3*10^6   4*10^6   5*10^6
---------------------------------------------------------
ONeillPrimes |  3.338 |  7.320 | 11.911 | 18.225 | 21.785
StreamPrimes |  3.867 |  8.405 | 13.656 | 21.542 | 37.640
BayerPrimes  |  3.960 |  8.940 | 18.528 | 33.221 | 38.568

Here is the code:

{-# OPTIONS_GHC -fglasgow-exts #-}

module StreamPrimes (primes) where

-- stream code

data Stream a = forall s. Stream (s -> (a,s)) s
data AStream a = A a (AStream a) | B (Stream a)

stream :: [a] -> Stream a
stream xs = Stream next xs
  where
    next []     = undefined
    next (x:xt) = (x,xt)

astream :: [a] -> AStream a
astream []     = undefined
astream (x:xt) = A x $ B $ stream xt

merge :: Ord a => Stream a -> Stream a -> Stream a
merge (Stream nextx vs) (Stream nexty ws) =
    Stream next (vt,ws,Left v)
    where
        (v,vt) = nextx vs
        next (xs,ys,Left x) =
            case nexty ys of
                (y,yt) ->
                    if   x < y
                    then (x,(xs,yt,Right y))
                    else (y,(xs,yt,Left x))
        next (xs,ys,Right y) =
            case nextx xs of
                (x,xt) ->
                    if   x < y
                    then (x,(xt,ys,Right y))
                    else (y,(xt,ys,Left x))

mergeA :: Ord a => AStream a -> AStream a -> AStream a
mergeA (A x xt) ys = A x (mergeA xt ys)
mergeA (B xs)   ys = mergeB xs ys

mergeB :: Ord a => Stream a -> AStream a -> AStream a
mergeB s@(Stream next xs) ys@(A y yt) =
    case next xs of
        (x,xt) ->
            if x < y
            then A x (mergeB (Stream next xt) ys)
            else A y (mergeB s yt)
mergeB xs (B ys) = B $ merge xs ys

-- Code for venturi :: Ord a => [[a]] -> [a]

root :: Ord a => AStream a -> [AStream a] -> [a]
root (A x xt) yss    = x : (root xt yss)
root (B xs) (ys:yst) = root (mergeB xs ys) yst
root _ _ = undefined

pair :: Ord a => [AStream a] -> [AStream a]
pair (x:y:xt) = mergeA x y : (pair xt)
pair _ = undefined

group :: Ord a => [AStream a] -> [AStream a]
group (x:xt) = x : (group $ pair xt)
group _ = undefined

venturi :: Ord a => [[a]] -> [a]
venturi (x:xt) = root (astream x) $ group $ map astream xt
venturi _ = undefined

-- Code for primes :: Integral a => [a]

diff  :: Ord a => [a] -> [a] -> [a]
diff xs@(x:xt) ys@(y:yt) = case compare x y of
    LT -> x : (diff  xt ys)
    EQ ->     (diff  xt yt)
    GT ->     (diff  xs yt)
diff _ _ = undefined

trim :: Integral a => a -> [a] -> [a]
trim p = let f m x = mod x m /= 0 in filter (f p)

seed :: Integral a => [a]
seed = [2,3,5,7,11,13,17]

wheel :: Integral a => [a]
wheel = drop 1 [ m*j + k | j <- [0..], k <- ws ]
    where m  = foldr1 (*) seed
          ws = foldr trim [1..m] seed

multiples :: Integral a => [a] -> [[a]]
multiples ws = map fst $ tail $ iterate g ([], ws)
    where g (_,ps@(p:pt)) = ([ m*p | m <- ps ], trim p pt)
          g _ = undefined

primes :: Integral a => [a]
primes = seed ++ (diff wheel $ venturi $ multiples wheel)


_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Reply via email to