On 03/19/2013 10:32 PM, Don Stewart wrote:
Oh, I forgot the technique of inlining the lazy bytestring chunks, and
processing each chunk seperately.

$ time ./fast
4166680
./fast  1.25s user 0.07s system 99% cpu 1.325 total

Essentially inline Lazy.foldlChunks and specializes is (the inliner
should really get that).
And now we have a nice unboxed inner loop, which llvm might spot:

$ ghc -O2 -funbox-strict-fields fast.hs  --make -fllvm
$ time ./fast
4166680
./fast  1.07s user 0.06s system 98% cpu *1.146 total*

So about 8x faster. Waiting for some non-lazy bytestring benchmarks... :)

Thanks Don, but after some investigation I came to conclusion that problem is in State monad

{-# LANGUAGE BangPatterns #-}

import Control.Monad.State.Strict

data S6 = S6 !Int !Int

main_6 = do
    let r = evalState go (S6 10000 0)
    print r
  where
    go = do
        (S6 i a) <- get
        if (i == 0) then return a else (put (S6 (i - 1) (a + i))) >> go

main_7 = do
    let r = go (S6 10000 0)
    print r
  where
    go (S6 i a)
        | i == 0 = a
        | otherwise = go $ S6 (i - 1) (a + i)

main = main_6

main_6 doing constant allocations while main_7 run in constant space. Can you suggest something that improve situation? I don't want to manually unfold all my code that I want to be fast :(.

_______________________________________________
Haskell-Cafe mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/haskell-cafe

Reply via email to