I've been using ghc-7.1.20110125 and it does indeed help a great deal. I've tried compiling several problematic functions and in most cases the problem is gone. However, in one of my test cases the closures are still being constructed:
let { lvl8_s1S8 :: [Data.Iteratee.Base.Iteratee s_aXZ m_aXY a_aY1] -> Data.Iteratee.Base.Iteratee s_aXZ m_aXY () [LclId, Str=DmdType] lvl8_s1S8 = EnumSequenceTest.enumSequence_ @ m_aXY @ s_aXZ @ el_aY0 @ a_aY1 $dMonad_s1Q4 $dListLike_s1Q0 $dNullable_s1PV } in The code for enumSequence_ is included in the attachment. There are two versions: * enumSequence-bad.hs is the original code * enumSequence-good.hs includes a fix suggested by Akio When compiled with ghc-7.1.20110125 and -O2 [1] it uses a lot of memory: ./enumSequence-main +RTS -s 1 100000000 0 22,787,568,816 bytes allocated in the heap 1,455,499,400 bytes copied during GC 260,651,512 bytes maximum residency (10 sample(s)) 14,457,544 bytes maximum slop 530 MB total memory in use (0 MB lost due to fragmentation) Generation 0: 43936 collections, 0 parallel, 1.58s, 1.57s elapsed Generation 1: 10 collections, 0 parallel, 1.05s, 1.05s elapsed INIT time 0.00s ( 0.00s elapsed) MUT time 4.64s ( 4.66s elapsed) GC time 2.63s ( 2.62s elapsed) EXIT time 0.00s ( 0.00s elapsed) Total time 7.27s ( 7.28s elapsed) %GC time 36.1% (36.0% elapsed) Alloc rate 4,905,159,063 bytes per MUT second Productivity 63.8% of total user, 63.8% of total elapsed while compiling with -O2 and -fno-full-laziness or -O0 reverts memory usage back to constant: ./enumSequence-main +RTS -s 1 100000000 0 22,493,819,416 bytes allocated in the heap 578,891,112 bytes copied during GC 46,696 bytes maximum residency (1 sample(s)) 19,928 bytes maximum slop 1 MB total memory in use (0 MB lost due to fragmentation) Generation 0: 43335 collections, 0 parallel, 1.07s, 1.07s elapsed Generation 1: 1 collections, 0 parallel, 0.00s, 0.00s elapsed INIT time 0.00s ( 0.00s elapsed) MUT time 4.53s ( 4.55s elapsed) GC time 1.07s ( 1.07s elapsed) EXIT time 0.00s ( 0.00s elapsed) Total time 5.60s ( 5.62s elapsed) %GC time 19.2% (19.0% elapsed) Alloc rate 4,966,355,161 bytes per MUT second Productivity 80.8% of total user, 80.5% of total elapsed -- Maciej [1] ghc --make -rtsopts -fforce-recomp -O2 enumSequence-bad.hs enumSequence-main.hs [2] ghc --make -rtsopts -fforce-recomp -O2 -fno-full-laziness enumSequence-bad.hs enumSequence-main.hs On Thu, Feb 10, 2011 at 2:00 AM, Simon Peyton-Jones <simo...@microsoft.com> wrote: > In general it's quite hard to solve this problem without risking losing > sharing. > > However in this case I added a simple arity analyser after the 7.0.1 release > which solves the problem. It'll be in 7.0.2. > > Try with HEAD and check it does what you expect. > > Simon > > | -----Original Message----- > | From: glasgow-haskell-users-boun...@haskell.org [mailto:glasgow-haskell- > | users-boun...@haskell.org] On Behalf Of Akio Takano > | Sent: 07 February 2011 04:10 > | To: glasgow-haskell-users@haskell.org > | Subject: Dictionaries and full laziness transformation > | > | Hi, > | > | I'm using GHC 7.0.1. I found that recursive overloaded functions tend > | to leak memory when compiled with full-laziness optimization on. Here > | is a simple case. > | > | -- TestSub.hs > | {-# LANGUAGE BangPatterns #-} > | module TestSub where > | > | {-# NOINLINE factorial #-} > | factorial :: (Num a) => a -> a -> a > | factorial !n !acc = if n == 0 then acc else factorial (n - 1) (acc * n) > | > | -- main.hs > | import TestSub > | > | factorial1 :: Int -> Int -> Int > | factorial1 = factorial > | > | main = do > | n <- readLn > | print $ factorial1 n 1 > | > | main > | > | This program should run in constant space, and compiled with -O0 or > | -O2 -fno-full-laziness, it does. However with -O2, it takes a linear > | amount of memory. The core for factorial looks like this: > | > | TestSub.factorial = > | \ (@ a_ajm) ($dNum_slz :: GHC.Num.Num a_ajm) -> > | let { > | a_slA :: GHC.Classes.Eq a_ajm > | [LclId] > | a_slA = GHC.Num.$p1Num @ a_ajm $dNum_slz } in > | let { > | lvl2_slC :: a_ajm -> a_ajm -> a_ajm > | [LclId] > | lvl2_slC = TestSub.factorial @ a_ajm $dNum_slz } in > | ... > | > | The problem is that lvl2_slC closure is created whenever factorial is > | applied to a Num dictionary, and kept alive until that application is > | GCed. In this program it never happens, because an application to the > | Num Int dictionary is referred to by the factorial1 CAF, and it > | recursively keeps the whole chain of closures alive. > | > | I know that full laziness transformation *sometimes* causes a space > | leak, but this looks like a bad result to me, because: > | > | - It looks like there is no point building a lot of equivalent > | closures, instead of one. > | - A lot of code can suffer from this behavior, because overloaded > | recursive functions are fairly common. > | For example, unfoldConvStream function from the latest iteratee > | package suffers from this problem, if I understand correctly. > | > | Does anyone have an idea on whether this can be fixed in GHC, or how > | to work around this problem? > | > | Regards, > | > | Takano Akio > | > | _______________________________________________ > | Glasgow-haskell-users mailing list > | Glasgow-haskell-users@haskell.org > | http://www.haskell.org/mailman/listinfo/glasgow-haskell-users > > > _______________________________________________ > Glasgow-haskell-users mailing list > Glasgow-haskell-users@haskell.org > http://www.haskell.org/mailman/listinfo/glasgow-haskell-users >
import Data.Iteratee import Control.Monad.IO.Class import qualified Data.Iteratee.ListLike as I import qualified Data.Iteratee.Iteratee as I import EnumSequenceTest main = do run $ iter [1..100000000] run $ iter [] where iter :: [Int] -> Iteratee [Int] IO () iter xs = joinIM $ enumPureNChunk xs 128 $ enumSequence_ [I.length >>= showIter, I.head >>= showIter] showIter :: Int -> Iteratee [Int] IO () showIter = liftIO . putStrLn . show
{-# LANGUAGE ScopedTypeVariables #-} {-# LANGUAGE RankNTypes #-} {-# LANGUAGE FlexibleContexts #-} module EnumSequenceTest where import Data.Iteratee.Iteratee import qualified Data.ListLike as LL import Control.Monad (liftM) import Control.Monad.Trans (lift) import Data.List (foldl1) import Data.Monoid (<>) :: Monoid a => a -> a -> a (<>) = mappend enumSequence_ :: forall m s el a . (Monad m, LL.ListLike s el, Nullable s) => [Iteratee s m a] -> Iteratee s m () enumSequence_ = self where self is = liftI step where step :: Stream s -> Iteratee s m () step s@(Chunk xs) | LL.null xs = liftI step step s@(Chunk xs) = do -- collect iteratees which are not yet "done" is' <- lift $ liftM (map snd . filter (not . fst)) . mapM enumCheckIfDone $ is if Prelude.null is' then idone () . (<> s) =<< lift (remainingStream is) else self . map (joinIM . enumPure1Chunk xs) $ is' step s@(EOF _) = idone () . (<> s) =<< (lift . remainingStream . map (joinIM . enumChunk s) $ is) -- returns the unconsumed part of the stream; "enumSequence_ is" consumes as -- much of the stream as the iteratee in is that consumes the most; e.g. -- enumSequence_ [I.head, I.last] consumes whole stream remainingStream :: [Iteratee s m a] -> m (Stream s) remainingStream = liftM (foldl1 shortest) . mapM (\i -> runIter i onDone onCont) where onDone _ s = return s onCont _ Nothing = return mempty onCont _ e = return (EOF e) shortest :: Stream s -> Stream s -> Stream s shortest (EOF _) s = s shortest s (EOF _) = s shortest (Chunk xs) (Chunk ys) | LL.length xs > LL.length ys = Chunk ys | LL.length xs < LL.length ys = Chunk xs | otherwise = Chunk xs
{-# LANGUAGE ScopedTypeVariables #-} {-# LANGUAGE RankNTypes #-} {-# LANGUAGE FlexibleContexts #-} module EnumSequenceTest where import Data.Iteratee.Iteratee import qualified Data.ListLike as LL import Control.Monad (liftM) import Control.Monad.Trans (lift) import Data.List (foldl1) import Data.Monoid (<>) :: Monoid a => a -> a -> a (<>) = mappend enumSequence_ :: forall m s el a . (Monad m, LL.ListLike s el, Nullable s) => [Iteratee s m a] -> Iteratee s m () enumSequence_ is = liftI step where step :: Stream s -> Iteratee s m () step s@(Chunk xs) | LL.null xs = liftI step step s@(Chunk xs) = do -- collect iteratees which are not yet "done" is' <- lift $ liftM (map snd . filter (not . fst)) . mapM enumCheckIfDone $ is if Prelude.null is' then idone () . (<> s) =<< lift (remainingStream is) else enumSequence_ . map (joinIM . enumPure1Chunk xs) $ is' step s@(EOF _) = idone () . (<> s) =<< (lift . remainingStream . map (joinIM . enumChunk s) $ is) -- returns the unconsumed part of the stream; "enumSequence_ is" consumes as -- much of the stream as the iteratee in is that consumes the most; e.g. -- enumSequence_ [I.head, I.last] consumes whole stream remainingStream :: [Iteratee s m a] -> m (Stream s) remainingStream = liftM (foldl1 shortest) . mapM (\i -> runIter i onDone onCont) where onDone _ s = return s onCont _ Nothing = return mempty onCont _ e = return (EOF e) shortest :: Stream s -> Stream s -> Stream s shortest (EOF _) s = s shortest s (EOF _) = s shortest (Chunk xs) (Chunk ys) | LL.length xs > LL.length ys = Chunk ys | LL.length xs < LL.length ys = Chunk xs | otherwise = Chunk xs
_______________________________________________ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users