Am Mittwoch 17 März 2010 17:01:06 schrieb Gregory Collins: > Vasyl Pasternak <vasyl.paster...@gmail.com> writes: > > Hi Cafe, > > > > Yesterday I played with iteratee package, and wanted to check its > > performance. I tried to count lines in a file, as Oleg in his famous > > lazy_vs_correct[1] article. The results somewhat disappointed me. > > eris:benchmark greg$ time ./IterateeTest /usr/share/dict/words > 234936 > > real 0m0.027s > user 0m0.010s > sys 0m0.015s > eris:benchmark greg$ time ./ByteStringTest /usr/share/dict/words > 234936 > > real 0m0.024s > user 0m0.015s > sys 0m0.007s > > > Note also that the Bytestring I/O functions use a 32KB buffer and the > iteratee "enumFd" function uses a 4KB buffer; if the buffers were the > same the performance would be comparable. Here is my code:
Which gave not so good performance on a large file, it ate ~530MB memory for a 260MB file with 20915399 lines (after the first run, when the other apps' memory had been swapped out, it wasn't awful anymore, just mediocre). It was, however, a good memory citizen - and pretty fast - after putting a BangPattern on go's argument. Times: wc -l: 0.90s your code with the BangPattern: 1.10s Vasyl's Iteratee code: 7.61s Vasyl's ByteString code: 1.40s Code from http://www.haskell.org/haskellwiki/Wc [1]: 0.76s (all Haskell code compiled with -O2 by ghc-6.12.1). I used iteratee-0.3.4 from hackage. So I'd say, if you know how, Iteratees can already be fast, but it seems to be easier to write slow code than with ByteString. [1] The winner: import qualified Data.ByteString.Lazy.Char8 as L main :: IO () main = L.getContents >>= print . L.count '\n' > > ------------------------------------------------------------------------ > > {-# LANGUAGE RankNTypes, ScopedTypeVariables #-} > > import qualified Data.Iteratee.IO.Fd as I > import qualified Data.Iteratee as I > import qualified Data.Iteratee.WrappedByteString as I > import qualified Data.ByteString.Char8 as S > > import System.Environment > import System.IO > > > count :: FilePath -> IO Int > count s = I.fileDriverFd cnt s > > cnt :: (Monad m) => I.IterateeG I.WrappedByteString Char m Int > cnt = go 0 > where > go n = I.IterateeG $ \ch -> > case ch of > (I.EOF Nothing) -> return $ I.Done n ch > (I.EOF (Just e)) -> return $ I.Cont cnt (Just e) > (I.Chunk (I.WrapBS s)) -> do > let n' = n + S.count '\n' s > return $ I.Cont (go n') Nothing > > main :: IO () > main = do > [f] <- getArgs > print =<< count f _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe