On Sat, 2007-12-15 at 14:34 +0000, Duncan Coutts wrote: > Ok, I presume this is a guessing game and we're supposed to just look at > the code without running and timing them.
Precisely :) > > All they do is read from stdin and count the number of spaces they see. > > There are two that use strict bytestrings, two that use lazy > > bytestrings, and two that use the standard Haskell strings. Three use a > > recursive function with an accumulator parameter and three use a foldl > > with a lambda function. > > > > Say the fastest one takes the time 1. How much time will the others > > take? > > > > And how about memory? How much memory do you think they require? Let's > > say we feed a 150MB(*) file into each of them, how many megabytes do you > > think they end up using (as seen from the OS, not in terms of how big > > the live heap is)? > > > > I'm going to post full benchmarks + analysis on Wednesday. > > Right'o. I'll have a go. Lets see if I can't embarrass myself with being > completely inaccurate. Thanks for biting! You were, thankfully, only almost completely inaccurate ;) > > PS: For extra credit, what do you think is the peak memory use for this > > program when given an input file of 150MB? > Hmm. So that should work in constant memory, a few 64 chunks at once. > I'd expect this to be pretty fast. You are right about the speed. Can you guess a number in kilobytes? > > > > > > > > ============================== > > hs/space-bs-c8-acc-1.hs: > > {-# LANGUAGE BangPatterns #-} > > > > import qualified Data.ByteString.Char8 as B > > > > cnt :: Int -> B.ByteString -> Int > > cnt !acc bs = if B.null bs > > then acc > > else cnt (if B.head bs == ' ' then acc+1 else acc) (B.tail bs) > > > > main = do s <- B.getContents > > print (cnt 0 s) > > This uses strict bytestrings so will use at least 150Mb and that'll make > it a good deal slower. In fact it'll be worse than that since > getContents does not know in advance how big the input will be so it has > to play the doubling and copying game. So it'll end up copying all the > data roughly twice. cnt is strict and tail recursive so that shouldn't > be any problem, though it's probably not as fast as the first length . > filter since head, tail, null all have to do bounds checks. You are right about the memory. It is actually slightly faster than the "extra credit" (length/filter combination) above. > > ============================== > > hs/space-bslc8-acc-1.hs: > > {-# LANGUAGE BangPatterns #-} > > > > import qualified Data.ByteString.Lazy.Char8 as B > > > > cnt :: Int -> B.ByteString -> Int > > cnt !acc bs = if B.null bs > > then acc > > else cnt (if B.head bs == ' ' then acc+1 else acc) (B.tail bs) > > > > main = do s <- B.getContents > > print (cnt 0 s) > > For the same reason as above, I'd expect this cnt to be slower than > B.length . B.filter (== ' ') It is slower but not for the same reason as above. > > ============================== > > hs/space-xxxxx-acc-1.hs: > > {-# LANGUAGE BangPatterns #-} > > > > cnt :: Int -> String -> Int > > cnt !acc bs = if null bs > > then acc > > else cnt (if head bs == ' ' then acc+1 else acc) (tail bs) > > > > main = do s <- getContents > > print (cnt 0 s) > > Lazy, so constant memory use, but much higher constant factors due to > using String. Spot on. > > ============================== > > hs/space-bs-c8-foldlx-1.hs: > > {-# LANGUAGE BangPatterns #-} > > > > import qualified Data.ByteString.Char8 as B > > > > cnt :: B.ByteString -> Int > > cnt bs = B.foldl' (\sum c -> if c == ' ' then sum+1 else sum) 0 bs > > > > main = do s <- B.getContents > > print (cnt s) > > This is of course still strict so that's going to make the reading slow. Nope. > This is a manually fused B.length . B.filter (== ' ') which hopefully is > the same speed as the automatically fused one if the fusion is working > ok. If not, then the B.length . B.filter (== ' ') will be doing a extra > copy, and memory writes are expensive. > > > ============================== > > hs/space-bslc8-foldlx-1.hs: > > {-# LANGUAGE BangPatterns #-} > > > > import qualified Data.ByteString.Lazy.Char8 as B > > > > cnt :: B.ByteString -> Int > > cnt bs = B.foldl' (\sum c -> if c == ' ' then sum+1 else sum) 0 bs > > > > main = do s <- B.getContents > > print (cnt s) > > As above but now in constant memory space. Nope. > > ============================== > > hs/space-xxxxx-foldl.hs: > > {-# LANGUAGE BangPatterns #-} > > > > cnt :: String -> Int > > cnt bs = foldl (\sum c -> if c == ' ' then sum+1 else sum) 0 bs > > > > main = do s <- getContents > > print (cnt s) > > Oh, no! not foldl that's a killer. You think it's worse than the program just above? > > Ok, so best way to summarise I think is to organise by data type since I > think that'll dominate. > > So I think the lazy bytestring versions will be fastest due to having > the best memory access patterns and doing the least copying. I think the > foldl's will be faster than the explicit accumulators due to having > fewer bounds checks. > > space-bslc8-foldlx-1 > space-bslc8-acc-1 > > space-bs-c8-foldlx-1 > space-bs-c8-acc-1 > > space-xxxxx-acc-1 > space-xxxxx-foldl > > I'll try guessing at some ratios: > > 1.0 space-bslc8-foldlx-1 > 1.1 space-bslc8-acc-1 > > 2.0 space-bs-c8-foldlx-1 > 2.1 space-bs-c8-acc-1 > > 4.0 space-xxxxx-acc-1 > 15 space-xxxxx-foldl I've done the measurements on a 2GHz Athlon64 3000+, a 1667 MHz Core Duo, and a 600MHz Pentium III. They all show the same pattern (with a few minor aberrations for the PIII). You did got one of the relative speeds right ;) (and some of the memory usages) I've tested with ghc 6.8.1 and 6.9.20071119 and 6.9.20071208 (or thereabouts). 6.6.1 won't run my benchmarks and it also won't let me install bytestring-0.9.0.1 to replace its built-in version. -Peter _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe