[Haskell-cafe] Reasoning about performance

Scott Pakin Tue, 03 Sep 2013 15:32:27 -0700

I'm a Haskell beginner, and I'm baffled trying to reason about code
performance, at least with GHC.  For a program I'm writing I needed to
find all pairs of elements of a list.  That is, given the list "ABCD"
I wanted to wind up with the list
[('A','B'),('A','C'),('A','D'),('B','C'),('B','D'),('C','D')], where
the order of elements in the resulting list is immaterial, but I don't
want both ('A', 'B') and ('B', 'A'), for example.


My first attempt looked like this:

    allPairs1 :: [a] -> [(a, a)]
    allPairs1 []     = []
    allPairs1 (x:xs) = map (\a  -> (x, a)) xs ++ allPairs1 xs

Benchmarking with ghci's ":set +s" reveals the following performance
(median of three runs shown):

    $ ghci
    GHCi, version 7.6.3: http://www.haskell.org/ghc/  :? for help
    Loading package ghc-prim ... linking ... done.
    Loading package integer-gmp ... linking ... done.
    Loading package base ... linking ... done.
    Prelude> :!ghc -c -O2 allpairs.hs
    Prelude> :load allpairs
    Ok, modules loaded: AllPairs.
    Prelude AllPairs> :m +Control.DeepSeq
    Prelude Control.DeepSeq AllPairs> :show modules
    AllPairs         ( allpairs.hs, allpairs.o )
    Prelude Control.DeepSeq AllPairs> :set +s
    Prelude Control.DeepSeq AllPairs> deepseq (allPairs1 [1..10000]) True
    True
    (4.85 secs, 4004173984 bytes)

A colleague suggested getting rid of the list append as follows:

    allPairs2 :: [a] -> [(a, a)]
    allPairs2 [] = []
    allPairs2 (x:xs) = allPairs2' x xs xs
      where allPairs2' x []     []     = []
            allPairs2' x (y:ys) zs     = (x,y) : allPairs2' x ys zs
            allPairs2' x []     (z:zs) = allPairs2' z zs zs

Not surprisingly, that runs faster:

    Prelude Control.DeepSeq AllPairs> deepseq (allPairs2 [1..10000]) True
    True
    (2.14 secs, 4403686376 bytes)

I then figured I could do even better by rewriting the code
tail-recursively:

    allPairs3 :: [a] -> [(a, a)]
    allPairs3 [] = []
    allPairs3 (x:xs) = allPairs3' x xs xs []
      where allPairs3' :: a -> [a] -> [a] -> [(a, a)] -> [(a, a)]
            allPairs3' h (x:xs) ys     result = allPairs3' h xs ys ((h, 
x):result)
            allPairs3' _ []     (y:ys) result = allPairs3' y ys ys result
            allPairs3' _ []     []     result = result

This takes half the memory as the above (yay!) but runs substantially
*slower* (and appears to be thrashing my computer's memory system):

    Prelude Control.DeepSeq AllPairs> deepseq (allPairs3 [1..10000]) True
    True
    (10.58 secs, 2403820152 bytes)

What gives?  Why does my tail-recursive implementation run even slower
and presumably produce more garbage than my initial, naive
implementation?  Is there some optimization GHC is performing for
allPairs1 and allPairs2 that it can't apply to allPairs3?

Similarly, how should I, as a newcomer who wants to write efficient
Haskell code, reason about whether a code change is likely to improve
rather than degrade performance?  Guidelines like "per-iteration
appends = bad" and "tail recursion = good" are great, but the
preceding data indicate that something subtler is taking precedence
over those guidelines with respect to code performance.

Thanks in advance,
-- Scott

_______________________________________________
Haskell-Cafe mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/haskell-cafe

[Haskell-cafe] Reasoning about performance

Reply via email to