(Sorry for the long email.) Summary: why does the attached program have non-constant memory use?
==== Introduction ==== I've written a program to do a big computation. Unfortunately, the computation takes a very long time (expectedly), and the memory use increases slowly (unexpectedly), until it fills up the entire memory and swap space of the computer (many gigabytes). The rough structure of the program is: • create a long (up to 20 million) list of objects; • compute a number for each of those objects; • compute the sum of the resulting list. I switched the intermediate data structure from a list to a Stream (from the stream-fusion package), hoping to fix the memory issue. It decreased both the memory use and the rate of its increase, but after a long time, the program still uses up all available memory. ==== A simple program After many hours of cutting down my program, I now have a small program (attached) that shows the same behaviour. It uses only said stream-fusion package, and vector. (I haven't yet tried to cut out the use of vector. I hope it is irrelevant, because all vectors are of fixed small size.) I compile the program with ghc-7.6.1 using > ghc --make -threaded -rtsopts -with-rtsopts="-M1G -K128M" -O2 -main-is > Test.main Test The rts options may not be strictly necessary: I added them at some point to allow the use of multiple cores, and to prevent the program from crashing the machine by using all available memory. When running the program, the resident memory quickly grows to about 3.5 MB (which I am fine with); this stays constant for a long time, but after about 7 minutes, it starts to grow further. The growth is slow, but I really would hope this program to run in constant memory. ==== The code ==== Note that I added an instance for Monad Stream, using concatMap. This is implicitly used in the definition of the big stream. The source of Data.Stream contains many alternative implementations of concat and concatMap, and alludes to the difficulty of making it fuse properly. Could it be that the fusion did not succeed in this case? Thanks for any help! Regards, Arie
module Test where import Control.Applicative ((<$>)) import qualified Data.Stream as S import Data.Stream (Stream) import qualified Data.Vector as V -- Vector stuff type V = V.Vector Integer -- Inner product. inp :: V -> V -> Integer inp a b = V.foldl' (+) 0 (V.zipWith (*) a b) -- Stream stuff instance Monad Stream where return = S.stream . return (>>=) = flip S.concatMap -- Generating vectors small :: Stream V small = go  8 where go :: [Integer] -> Integer -> Stream V go xs 1 = return $ V.fromList xs go xs n = do x <- S.stream [-1,0,0,1] go (x : xs) (n - 1) big :: Stream (V,V) big = do v <- small w <- small return (v,w) -- Main program produce :: Stream Integer produce = S.filter (== 4) $ uncurry inp <$> big consume :: Stream Integer -> Integer consume = S.foldl' (+) 0 main :: IO () main = print (consume produce)
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe