memory slop (was: Using the GHC heap profiler)

2011-03-22 Thread Tim Docker


On Mon, Mar 21, 2011 at 9:59 AM, I wrote:


> My question on the ghc heap profiler on stack overflow:
>
> http://stackoverflow.com/questions/5306717/how-should-i-interpret-the-output-of-the-ghc-heap-profiler
>
> remains unanswered :-( Perhaps that's not the best forum. Is there someone
> here prepared to explain how the memory usage in the heap profiler relates
> to the "Live Bytes" count shown in the garbage collection statistics?


I've made a little progress on this. I've simplified my program down to 
a simple executable that loads a bunch of data into an in-memory map, 
and then writes it out again. I've added calls to `seq` to ensure that 
laziness is not causing excessive memory consumption. When I run this on 
my sample data set, it takes ~7 CPU seconds and uses ~120 MB of VM. An 
equivalent Python script takes ~2 secs and ~19 MB of VM :-(.


The code is below. I'm mostly concerned with the memory usage rather 
than performance at this stage. What is interesting is that when I turn 
on garbage collection statistics (+RTS -s), I see this:


  10,089,324,996 bytes allocated in the heap
     201,018,116 bytes copied during GC
      12,153,592 bytes maximum residency (8 sample(s))
      59,325,408 bytes maximum slop
             114 MB total memory in use (1 MB lost due to fragmentation)

  Generation 0: 19226 collections,     0 parallel,  1.59s,  1.64s elapsed
  Generation 1:     8 collections,     0 parallel,  0.04s,  0.04s elapsed

  INIT  time    0.00s  (  0.00s elapsed)
  MUT   time    5.84s  (  5.96s elapsed)
  GC    time    1.63s  (  1.68s elapsed)
  EXIT  time    0.00s  (  0.00s elapsed)
  Total time    7.47s  (  7.64s elapsed)

  %GC time      21.8%  (22.0% elapsed)

  Alloc rate    1,726,702,840 bytes per MUT second

  Productivity  78.2% of total user, 76.5% of total elapsed

This seems strange. The maximum residency of 12MB sounds about correct 
for my data. But what's with the 59MB of slop? According to the ghc docs:


| The "bytes maximum slop" tells you the most space that is ever wasted
| due to the way GHC allocates memory in blocks. Slop is memory at the
| end of a block that was wasted. There's no way to control this; we
| just like to see how much memory is being lost this way.

There's this page also:

http://hackage.haskell.org/trac/ghc/wiki/Commentary/Rts/Storage/Slop

but it doesn't really make things clearer for me.

Is the slop number above likely to be a significant contribution to net 
memory usage? Are there any obvious reasons why the code below could be 
generating so much? The data file in question has 61k lines, and is 6MB 
in total.


Thanks,

Tim

 Map2.hs 

module Main where

import qualified Data.Map as Map
import qualified Data.ByteString.Char8 as BS
import System.Environment
import System.IO

type MyMap = Map.Map BS.ByteString BS.ByteString

foldLines :: (a -> String -> a) -> a -> Handle -> IO a
foldLines f a h = do
    eof <- hIsEOF h
    if eof
      then return a
      else do
        l <- hGetLine h
        let a' = f a l
        a' `seq` foldLines f a' h

undumpFile :: FilePath -> IO MyMap
undumpFile path = do
    h <- openFile path ReadMode
    m <- foldLines addv Map.empty h
    hClose h
    return m
  where
    addv m "" = m
    addv m s  = let (k,v) = readKV s
                in k `seq` v `seq` Map.insert k v m

    readKV s = let (ks,vs) = read s in (BS.pack ks, BS.pack vs)

dump :: [(BS.ByteString,BS.ByteString)] -> IO ()
dump vs = mapM_ putV vs
  where
    putV (k,v) = putStrLn (show (BS.unpack k, BS.unpack v))

main :: IO ()
main = do
    args <- getArgs
    case args of
      [path] -> do
        v <- undumpFile path
        dump (Map.toList v)
        return ()








___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: memory slop (was: Using the GHC heap profiler)

2011-03-22 Thread Daniel Fischer
On Wednesday 23 March 2011 03:32:16, Tim Docker wrote:
> Is the slop number above likely to be a significant contribution to net
> memory usage?

Yes, absolutely.

> Are there any obvious reasons why the code below could be
> generating so much?

I suspect packing a lot of presumably relatively short ByteStrings would 
generate (the lion's share of) the slop. I'm not familiar with the 
internals, though, so I don't know where GHC would put a 
newPinnedByteArray# (which is where your ByteString contents live), or what 
alignment requirements those have.
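
One quick way to test that hypothesis (untested sketch, a hypothetical
standalone SlopTest.hs; it assumes, per the above, that ByteString payloads
live in pinned memory): pack a comparable number of short strings, keep them
all live, and compare the "bytes maximum slop" figure reported by +RTS -s
(compile with -rtsopts):

module Main where

import qualified Data.ByteString.Char8 as BS

main :: IO ()
main = do
    -- roughly as many short ByteStrings as the real program creates
    let bss = [ BS.pack ("key-" ++ show i ++ ",some short value")
              | i <- [1 :: Int .. 61000] ]
    print (sum (map BS.length bss))   -- forces every ByteString
    print (length bss)                -- keeps the whole list live above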

> undumpFile :: FilePath -> IO MyMap
> undumpFile path = do
>     h <- openFile path ReadMode
>     m <- foldLines addv Map.empty h
>     hClose h
>     return m
>   where
>     addv m "" = m
>     addv m s  = let (k,v) = readKV s
>                 in k `seq` v `seq` Map.insert k v m
>
>     readKV s = let (ks,vs) = read s in (BS.pack ks, BS.pack vs)

It might be better to read the file in one go and construct the map in pure 
code (foldl' addv Map.empty $ lines filecontents).
Also, it will probably be better to do everything on ByteStrings.
The file format seems to be
("key","value")
on each line, with possible whitespace and empty lines.
If none of the keys or values may contain a '"',

undumpFile path = do
    contents <- BS.readFile path
    return $! foldl' addv Map.empty (BS.lines contents)
  where
    -- foldl' comes from Data.List
    addv m s
      | BS.null s = m
      | otherwise = case BS.split '"' s of
          (_ : k : _ : v : _) -> Map.insert k v m
          _ -> error "malformed line"
should perform much better.
If a key or value may contain a '"', it's more complicated; using a regex 
library to split might be a good option then.
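
Alternatively (untested sketch, reusing Map and BS from Map2.hs): keep the
fast split for well-formed lines, and fall back to 'read' on the unpacked
line for anything else, since read understands the escaping that show
produced:

addv :: MyMap -> BS.ByteString -> MyMap
addv m s
  | BS.null s = m
  | otherwise = case BS.split '"' s of
      -- a line like ("key","value") splits into exactly five pieces
      [_, k, _, v, _] -> Map.insert k v m
      -- anything else (e.g. an escaped quote inside a key or value)
      -- goes through the slower but correct Read instance
      _ -> let (ks, vs) = read (BS.unpack s)
           in Map.insert (BS.pack ks) (BS.pack vs) m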

 

Re: memory slop (was: Using the GHC heap profiler)

2011-03-22 Thread John Lato
Hi Tim,

Sorry I can't tell you more about slop (I know less than you at this point),
but I do see the problem.  You're reading each line from a Handle as a
String (bad), then creating ByteStrings from that string with BS.pack
(really bad).  You want to read a ByteString (or Data.Text, or other compact
representation) directly from the handle without going through an
intervening string format.  Also, you'll be better off using a real parser
instead of read, which is very difficult to use robustly.
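
For example, with attoparsec (an untested sketch; module and function names
as they appear in the attoparsec package, and assuming no escaped quotes
inside keys or values) the line can be parsed straight from a ByteString:

import Data.Attoparsec.ByteString.Char8 (Parser, parseOnly, char, takeTill)
import qualified Data.ByteString.Char8 as BS

-- Parses a line of the form ("key","value") without going through String.
pairP :: Parser (BS.ByteString, BS.ByteString)
pairP = do
    _ <- char '('
    k <- quoted
    _ <- char ','
    v <- quoted
    _ <- char ')'
    return (k, v)
  where
    quoted = do
      _ <- char '"'
      t <- takeTill (== '"')
      _ <- char '"'
      return t

readKV :: BS.ByteString -> Either String (BS.ByteString, BS.ByteString)
readKV = parseOnly pairP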

John L.





Re: memory slop (was: Using the GHC heap profiler)

2011-03-22 Thread Johan Tibell
On Wed, Mar 23, 2011 at 9:32 AM, Tim Docker <t...@dockerz.net> wrote:
>   Productivity  78.2% of total user, 76.5% of total elapsed

As a rule of thumb GC time should be less than 10%.

> This seems strange. The maximum residency of 12MB sounds about correct for
> my data. But what's with the 59MB of slop? According to the ghc docs:

> | The "bytes maximum slop" tells you the most space that is ever wasted
> | due to the way GHC allocates memory in blocks. Slop is memory at the
> | end of a block that was wasted. There's no way to control this; we
> | just like to see how much memory is being lost this way.

GHC requests memory from the OS in large blocks. This makes GC more
efficient. The program might not end up using all the allocated memory
in the end.

> type MyMap = Map.Map BS.ByteString BS.ByteString

Try using HashMap from the unordered-containers package. It's
typically 2-3x faster than Map for ByteString/Text keys.
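
Something along these lines (sketch only; assumes the unordered-containers
and hashable packages, which provide Data.HashMap.Strict and the Hashable
instance for ByteString; addKV is only there to show the insert signature):

import qualified Data.ByteString.Char8 as BS
import qualified Data.HashMap.Strict as HM

-- Drop-in replacement for the Map-based alias; HM.insert and HM.toList
-- mirror the Map.insert / Map.toList calls in Map2.hs.
type MyMap = HM.HashMap BS.ByteString BS.ByteString

addKV :: BS.ByteString -> BS.ByteString -> MyMap -> MyMap
addKV = HM.insert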

> foldLines :: (a -> String -> a) -> a -> Handle -> IO a
> foldLines f a h = do
>     eof <- hIsEOF h
>     if eof
>       then return a
>       else do
>         l <- hGetLine h
>         let a' = f a l
>         a' `seq` foldLines f a' h

Your foldLines is not strict enough. Consider what happens if you call

foldLines someF undefined someHandle

when the file is empty. If foldLines was strict in the accumulator
you'd expect the program to crash (from evaluating undefined), but it
doesn't as 'return a' doesn't force 'a'.

In addition, you'd like GHC to inline foldLines so that the indirect
function call to 'f' can be turned into a call to a known function.
Here's a better definition:

-- needs {-# LANGUAGE BangPatterns #-} for the strict arguments
foldLines :: (a -> String -> a) -> a -> Handle -> IO a
foldLines f a0 !h = go a0
  where
    go !a = do
      eof <- hIsEOF h
      if eof
        then return a
        else do
          l <- hGetLine h
          go (f a l)
{-# INLINE foldLines #-}

Also, as others have mentioned, String is no good. Use ByteString and
Text. Both come with functions to read lines (if I recall correctly).
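
For example, a ByteString variant of the fold above (untested sketch,
reusing the imports from Map2.hs; BS.hGetLine comes from
Data.ByteString.Char8, and the bang pattern needs BangPatterns):

foldLinesBS :: (a -> BS.ByteString -> a) -> a -> Handle -> IO a
foldLinesBS f a0 h = go a0
  where
    go !a = do
      eof <- hIsEOF h
      if eof
        then return a
        else do
          l <- BS.hGetLine h
          go (f a l)
{-# INLINE foldLinesBS #-}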

Johan



Re: memory slop (was: Using the GHC heap profiler)

2011-03-22 Thread John Lato
Minor update, here's how I would handle this problem (using uu-parsinglib
and the latest ListLike, mostly untested):


import Data.ListLike (fromString, CharString (..))
import Text.ParserCombinators.UU
import Text.ParserCombinators.UU.BasicInstances
import Text.ParserCombinators.UU.Utils

-- change the local bindings in undumpFile to:

addv m s | BS.null s = m
addv m s = let (k,v) = readKV s
           in Map.insert k v m

readKV :: BS.ByteString -> (BS.ByteString, BS.ByteString)
readKV s = let [ks,vs] = parse (pTuple [pQuotedString, pQuotedString])
                               (createStr (LineColPos 0 0 0) $ CS s)
               unCSf = BS.drop 1 . BS.init . unCS
           in (unCSf ks, unCSf vs)


And of course change the type of foldLines and use
BS.hGetLine, both to enable ByteString IO.

To use uu-parsinglib's character parsers (e.g. pTuple) with ByteStrings, you
need to use a newtype wrapper such as CharString from ListLike; CS and
unCS wrap and unwrap the type.  The unCSf function removes the starting
and trailing quotes in addition to unwrapping.  This is still
quick-and-dirty in that there's no error recovery, but that's easy to add;
see the uu-parsinglib documentation and examples, particularly pEnd.

I think this will make a significant difference to your application.

John L.

