memory slop (was: Using the GHC heap profiler)
On Mon, Mar 21, 2011 at 9:59 AM, I wrote: My question on the ghc heap profiler on stack overflow: http://stackoverflow.com/questions/5306717/how-should-i-interpret-the-output-of-the-ghc-heap-profiler remains unanswered :-( Perhaps that's not the best forum. Is there someone here prepared to explain how the memory usage in the heap profiler relates to the Live Bytes count shown in the garbage collection statistics? I've made a little progress on this. I've simplified my program down to a simple executable that loads a bunch of data into an in-memory map, and then writes it out again. I've added calls to `seq` to ensure that laziness is not causing excessing memory consumption. When I run this on my sample data set, it takes ~7 cpu seconds, and uses ~120 MB of vm An equivalent python script, takes ~2 secs and ~19MB of vm :-(. The code is below. I'm mostly concerned with the memory usage rather than performance at this stage. What is interesting, is that when I turn on garbage collection statistics (+RTS -s), I see this: 10,089,324,996 bytes allocated in the heap 201,018,116 bytes copied during GC 12,153,592 bytes maximum residency (8 sample(s)) 59,325,408 bytes maximum slop 114 MB total memory in use (1 MB lost due to fragmentation) Generation 0: 19226 collections, 0 parallel, 1.59s, 1.64selapsed Generation 1: 8 collections, 0 parallel, 0.04s, 0.04selapsed INIT time0.00s ( 0.00s elapsed) MUT time5.84s ( 5.96s elapsed) GCtime1.63s ( 1.68s elapsed) EXIT time0.00s ( 0.00s elapsed) Total time7.47s ( 7.64s elapsed) %GC time 21.8% (22.0% elapsed) Alloc rate1,726,702,840 bytes per MUT second Productivity 78.2% of total user, 76.5% of total elapsed This seems strange. The maximum residency of 12MB sounds about correct for my data. But what's with the 59MB of slop? According to the ghc docs: | The bytes maximum slop tells you the most space that is ever wasted | due to the way GHC allocates memory in blocks. Slop is memory at the | end of a block that was wasted. There's no way to control this; we | just like to see how much memory is being lost this way. There's this page also: http://hackage.haskell.org/trac/ghc/wiki/Commentary/Rts/Storage/Slop but it doesn't really make things clearer for me. Is the slop number above likely to be a significant contribution to net memory usage? Are there any obvious reasons why the code below could be generating so much? The data file in question has 61k lines, and is 6MB in total. Thanks, Tim Map2.hs module Main where import qualified Data.Map as Map import qualified Data.ByteString.Char8 as BS import System.Environment import System.IO type MyMap = Map.Map BS.ByteString BS.ByteString foldLines :: (a - String - a) - a - Handle - IO a foldLines f a h = do eof - hIsEOF h if eof then (return a) else do l - hGetLine h let a' = f a l a' `seq` foldLines f a' h undumpFile :: FilePath - IO MyMap undumpFile path = do h - openFile path ReadMode m - foldLines addv Map.empty h hClose h return m where addv m = m addv m s = let (k,v) = readKV s in k `seq` v `seq` Map.insert k v m readKV s = let (ks,vs) = read s in (BS.pack ks, BS.pack vs) dump :: [(BS.ByteString,BS.ByteString)] - IO () dump vs = mapM_ putV vs where putV (k,v) = putStrLn (show (BS.unpack k, BS.unpack v)) main :: IO () main = do args - getArgs case args of [path] - do v - undumpFile path dump (Map.toList v) return () ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: memory slop (was: Using the GHC heap profiler)
On Wednesday 23 March 2011 03:32:16, Tim Docker wrote: On Mon, Mar 21, 2011 at 9:59 AM, I wrote: My question on the ghc heap profiler on stack overflow: http://stackoverflow.com/questions/5306717/how-should-i-interpret-the- output-of-the-ghc-heap-profiler remains unanswered :-( Perhaps that's not the best forum. Is there someone here prepared to explain how the memory usage in the heap profiler relates to the Live Bytes count shown in the garbage collection statistics? I've made a little progress on this. I've simplified my program down to a simple executable that loads a bunch of data into an in-memory map, and then writes it out again. I've added calls to `seq` to ensure that laziness is not causing excessing memory consumption. When I run this on my sample data set, it takes ~7 cpu seconds, and uses ~120 MB of vm An equivalent python script, takes ~2 secs and ~19MB of vm :-(. The code is below. I'm mostly concerned with the memory usage rather than performance at this stage. What is interesting, is that when I turn on garbage collection statistics (+RTS -s), I see this: 10,089,324,996 bytes allocated in the heap 201,018,116 bytes copied during GC 12,153,592 bytes maximum residency (8 sample(s)) 59,325,408 bytes maximum slop 114 MB total memory in use (1 MB lost due to fragmentation) Generation 0: 19226 collections, 0 parallel, 1.59s, 1.64selapsed Generation 1: 8 collections, 0 parallel, 0.04s, 0.04selapsed INIT time0.00s ( 0.00s elapsed) MUT time5.84s ( 5.96s elapsed) GCtime1.63s ( 1.68s elapsed) EXIT time0.00s ( 0.00s elapsed) Total time7.47s ( 7.64s elapsed) %GC time 21.8% (22.0% elapsed) Alloc rate1,726,702,840 bytes per MUT second Productivity 78.2% of total user, 76.5% of total elapsed This seems strange. The maximum residency of 12MB sounds about correct for my data. But what's with the 59MB of slop? According to the ghc docs: | The bytes maximum slop tells you the most space that is ever wasted | due to the way GHC allocates memory in blocks. Slop is memory at the | end of a block that was wasted. There's no way to control this; we | just like to see how much memory is being lost this way. There's this page also: http://hackage.haskell.org/trac/ghc/wiki/Commentary/Rts/Storage/Slop but it doesn't really make things clearer for me. Is the slop number above likely to be a significant contribution to net memory usage? Yes, absolutely. Are there any obvious reasons why the code below could be generating so much? I suspect packing a lot of presumably relatively short ByteStrings would generate (the lion's share of) the slop. I'm not familiar with the internals, though, so I don't know where GHC would put a newPinnedByteArray# (which is where your ByteString contents is), what alignement requirements those have. The data file in question has 61k lines, and is 6MB in total. Thanks, Tim Map2.hs module Main where import qualified Data.Map as Map import qualified Data.ByteString.Char8 as BS import System.Environment import System.IO type MyMap = Map.Map BS.ByteString BS.ByteString foldLines :: (a - String - a) - a - Handle - IO a foldLines f a h = do eof - hIsEOF h if eof then (return a) else do l - hGetLine h let a' = f a l a' `seq` foldLines f a' h undumpFile :: FilePath - IO MyMap undumpFile path = do h - openFile path ReadMode m - foldLines addv Map.empty h hClose h return m where addv m = m addv m s = let (k,v) = readKV s in k `seq` v `seq` Map.insert k v m readKV s = let (ks,vs) = read s in (BS.pack ks, BS.pack vs) It might be better to read the file in one go and construct the map in pure code (foldl' addv Map.empty $ lines filecontents). Also, it will probably be better to do everything on ByteStrings. The file format seems to be (key,value) on each line, with possible whitespace and empty lines. If none of the keys or values may contain a '\', undumpFile path = do contents - BS.readFile path return $! foldl' addv Map.empty (BS.lines contents) where addv m s | BS.null s = m | otherwise = case BS.split '' s of (_ : k : _ : v : _) - Map.insert k v m _ - error malformed line should perform much better. If a key or value may contain '', it's more complicated, using a regex library to split might be a good option then. dump :: [(BS.ByteString,BS.ByteString)] - IO () dump vs = mapM_ putV vs where putV (k,v) = putStrLn (show (BS.unpack k, BS.unpack v)) main :: IO () main = do args - getArgs case args of [path] - do v - undumpFile path
Re: memory slop (was: Using the GHC heap profiler)
Hi Tim, Sorry I can't tell you more about slop (I know less than you at this point), but I do see the problem. You're reading each line from a Handle as a String (bad), then creating ByteStrings from that string with BS.pack (really bad). You want to read a ByteString (or Data.Text, or other compact representation) directly from the handle without going through an intervening string format. Also, you'll be better off using a real parser instead of read, which is very difficult to use robustly. John L. From: Tim Docker t...@dockerz.net Subject: memory slop (was: Using the GHC heap profiler) To: glasgow-haskell-users@haskell.org Message-ID: 4d895bb0.1080...@dockerz.net Content-Type: text/plain; charset=ISO-8859-1; format=flowed On Mon, Mar 21, 2011 at 9:59 AM, I wrote: My question on the ghc heap profiler on stack overflow: http://stackoverflow.com/questions/5306717/how-should-i-interpret-the-output-of-the-ghc-heap-profiler remains unanswered :-( Perhaps that's not the best forum. Is there someone here prepared to explain how the memory usage in the heap profiler relates to the Live Bytes count shown in the garbage collection statistics? I've made a little progress on this. I've simplified my program down to a simple executable that loads a bunch of data into an in-memory map, and then writes it out again. I've added calls to `seq` to ensure that laziness is not causing excessing memory consumption. When I run this on my sample data set, it takes ~7 cpu seconds, and uses ~120 MB of vm An equivalent python script, takes ~2 secs and ~19MB of vm :-(. The code is below. I'm mostly concerned with the memory usage rather than performance at this stage. What is interesting, is that when I turn on garbage collection statistics (+RTS -s), I see this: 10,089,324,996 bytes allocated in the heap 201,018,116 bytes copied during GC 12,153,592 bytes maximum residency (8 sample(s)) 59,325,408 bytes maximum slop 114 MB total memory in use (1 MB lost due to fragmentation) Generation 0: 19226 collections, 0 parallel, 1.59s, 1.64selapsed Generation 1: 8 collections, 0 parallel, 0.04s, 0.04selapsed INIT time0.00s ( 0.00s elapsed) MUT time5.84s ( 5.96s elapsed) GCtime1.63s ( 1.68s elapsed) EXIT time0.00s ( 0.00s elapsed) Total time7.47s ( 7.64s elapsed) %GC time 21.8% (22.0% elapsed) Alloc rate1,726,702,840 bytes per MUT second Productivity 78.2% of total user, 76.5% of total elapsed This seems strange. The maximum residency of 12MB sounds about correct for my data. But what's with the 59MB of slop? According to the ghc docs: | The bytes maximum slop tells you the most space that is ever wasted | due to the way GHC allocates memory in blocks. Slop is memory at the | end of a block that was wasted. There's no way to control this; we | just like to see how much memory is being lost this way. There's this page also: http://hackage.haskell.org/trac/ghc/wiki/Commentary/Rts/Storage/Slop but it doesn't really make things clearer for me. Is the slop number above likely to be a significant contribution to net memory usage? Are there any obvious reasons why the code below could be generating so much? The data file in question has 61k lines, and is 6MB in total. Thanks, Tim Map2.hs module Main where import qualified Data.Map as Map import qualified Data.ByteString.Char8 as BS import System.Environment import System.IO type MyMap = Map.Map BS.ByteString BS.ByteString foldLines :: (a - String - a) - a - Handle - IO a foldLines f a h = do eof - hIsEOF h if eof then (return a) else do l - hGetLine h let a' = f a l a' `seq` foldLines f a' h undumpFile :: FilePath - IO MyMap undumpFile path = do h - openFile path ReadMode m - foldLines addv Map.empty h hClose h return m where addv m = m addv m s = let (k,v) = readKV s in k `seq` v `seq` Map.insert k v m readKV s = let (ks,vs) = read s in (BS.pack ks, BS.pack vs) dump :: [(BS.ByteString,BS.ByteString)] - IO () dump vs = mapM_ putV vs where putV (k,v) = putStrLn (show (BS.unpack k, BS.unpack v)) main :: IO () main = do args - getArgs case args of [path] - do v - undumpFile path dump (Map.toList v) return () ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: memory slop (was: Using the GHC heap profiler)
On Wed, Mar 23, 2011 at 9:32 AM, Tim Docker t...@dockerz.net wrote: Productivity 78.2% of total user, 76.5% of total elapsed As a rule of thumb GC time should be less than 10%. This seems strange. The maximum residency of 12MB sounds about correct for my data. But what's with the 59MB of slop? According to the ghc docs: | The bytes maximum slop tells you the most space that is ever wasted | due to the way GHC allocates memory in blocks. Slop is memory at the | end of a block that was wasted. There's no way to control this; we | just like to see how much memory is being lost this way. GHC requests memory from the OS in large blocks. This makes GC more efficient. The program might not end up using all the allocated memory in the end. type MyMap = Map.Map BS.ByteString BS.ByteString Try using HashMap from the unordered-collections package. It's typically 2-3x faster than Map for ByteString/Text keys. foldLines :: (a - String - a) - a - Handle - IO a foldLines f a h = do eof - hIsEOF h if eof then (return a) else do l - hGetLine h let a' = f a l a' `seq` foldLines f a' h Your foldLines is not strict enough. Consider what happens if you call foldLines someF undefined someHandle when the file is empty. If foldLines was strict in the accumulator you'd expect the program to crash (from evaluating undefined), but it doesn't as 'return a' doesn't force 'a'. In addition, you'd like GHC to inline foldLines so the indirect function call to 'f' can be turned to a call to a known function. Here's a better definition: foldLines :: (a - String - a) - a - Handle - IO a foldLines f a0 !h = go a0 where go !a = do eof - hIsEOF h if eof then (return a) else do l - hGetLine h go (f a l) {-# INLINE foldLines #-} Also, as others have mentioned, String is no good. Use ByteString and Text. Both come with functions to read lines (if I recall correctly). Johan ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: memory slop (was: Using the GHC heap profiler)
Minor update, here's how I would handle this problem (using uu-parsinglib and the latest ListLike, mostly untested): import Data.ListLike (fromString, CharString (..)) import Text.ParserCombinators.UU import Text.ParserCombinators.UU.BasicInstances import Text.ParserCombinators.UU.Utils -- change the local bindings in undumpFile to: addv m s | BS.null s = m addv m s = let (k,v) = readKV s in Map.insert k v m readKV :: BS.ByteString - (BS.ByteString, BS.ByteString) readKV s = let [ks,vs] = parse (pTuple [pQuotedString, pQuotedString]) (createStr (LineColPos 0 0 0) $ CS s) unCSf = BS.drop 1 . BS.init . unCS in (unCSf ks, unCSf vs) And of course change the type of foldLines and use BS.hGetLine, both to enable ByteString IO. To use uu-parsinglib's character parsers (e.g. pTuple) with ByteStrings, you need to use a newtype wrapper such as CharString from ListLike, CS and unCS wrap and unwrap the type. The unCSf function removes the starting and trailing quotes in addition to unwrapping. This is still quick-and-dirty in that there's no error recovery, but it's easy to add, just see the uu-parsinglib documentation and examples, particularly pEnd. I think this will make a significant difference to your application. John L. Message: 4 Date: Tue, 22 Mar 2011 20:32:16 -0600 From: Tim Docker t...@dockerz.net Subject: memory slop (was: Using the GHC heap profiler) To: glasgow-haskell-users@haskell.org Message-ID: 4d895bb0.1080...@dockerz.net Content-Type: text/plain; charset=ISO-8859-1; format=flowed Map2.hs module Main where import qualified Data.Map as Map import qualified Data.ByteString.Char8 as BS import System.Environment import System.IO type MyMap = Map.Map BS.ByteString BS.ByteString foldLines :: (a - String - a) - a - Handle - IO a foldLines f a h = do eof - hIsEOF h if eof then (return a) else do l - hGetLine h let a' = f a l a' `seq` foldLines f a' h undumpFile :: FilePath - IO MyMap undumpFile path = do h - openFile path ReadMode m - foldLines addv Map.empty h hClose h return m where addv m = m addv m s = let (k,v) = readKV s in k `seq` v `seq` Map.insert k v m readKV s = let (ks,vs) = read s in (BS.pack ks, BS.pack vs) dump :: [(BS.ByteString,BS.ByteString)] - IO () dump vs = mapM_ putV vs where putV (k,v) = putStrLn (show (BS.unpack k, BS.unpack v)) main :: IO () main = do args - getArgs case args of [path] - do v - undumpFile path dump (Map.toList v) return () ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users