A problem with GHC? =================== This post describes some odd behaviour I have seen in GHC 6.12.1 when writing Combinatorrent. The post is literate Haskell so you can run it. The executive summary: A space leak occurs when a new process is spawned from inside another process - and I can't figure out why. I am asking for help on haskell-cafe.
We begin by upgrading GHC from Haskell98 to something mature and modern: > {-# LANGUAGE GeneralizedNewtypeDeriving #-} And then we do the import-tango: > module Main where > > import Control.Monad.Reader > import Control.Monad.State > > In our system, a *Process* is an identifier for a thread of execution. Since these heavily makes use of the following imports, > import Control.Concurrent > import Control.Concurrent.STM it is beneficial to make processes live in the IO-monad. However, it turns out that having everything living in the IO monad is a clunky idea. A Process also has a current configuration: The channels and variables on which it can communicate for instance. The configuration is not expected to be changed over the course of the process running. For data which do change when the process runs, we want some state tracking. Luckily, the Xmonad X-monad comes to our rescue. In all its gory and glory details, we generalize it and rewrite it for our processes, which is how they are used in combinatorrent. Our model-kit for building new threads of execution is the following beast: > newtype Process a b c = Process (ReaderT a (StateT b IO) c) > deriving (Functor, Monad, MonadIO, MonadState b, MonadReader a) Note that the automatic derivations of *MonadState b* and *MonadReader a* makes GHC spit our some mkUsageInfo warnings in its generation of the .hi-files. They don't seem to be dangerous. Glueing instructions for our model kit is given by spawning off new threads: > run :: a -> b -> Process a b c -> IO (c, b) > run c st (Process p) = runStateT (runReaderT p c) st > > spawn :: a -> b -> Process a b () -> IO ThreadId > spawn c st p = forkIO $ run c st p >> return () Our first dummy --------------- Machinery for having fun is now in place. Here is a crash-test-dummy we would like to play with: > main1 = do > spawn () () (forever $ return ()) > threadDelay (3 * 1000000) > > -- main = main1 Note that the given units () and () are usually much more complicated, but for the sake of this minimal example units will do. And running that does what we expect it to do: ./Post +RTS -tstderr <<ghc: 1156600984 bytes, 2207 GCs, 2668/2668 avg/max bytes residency (1 samples), 1M in use, 0.00 INIT (0.00 elapsed), 2.72 MUT (2.67 elapsed), 0.28 GC (0.34 elapsed) :ghc>> At this point, we got our own little processor-heater installed in our room. But spring is coming to the Northern hemisphere, so we have little use for this until November or December. To actually make our system do something, we build this gem: > p1 :: Process () () () > p1 = forever $ return () > > startp1 :: IO ThreadId > startp1 = spawn () () p1 > > startp2 :: IO ThreadId > startp2 = spawn () () (forever $ > do liftIO startp1 > liftIO $ putStrLn "Delaying" > liftIO $ threadDelay (10 * 1000000)) > > main2 = do > putStrLn "Main thread starting" > startp2 > threadDelay (1 * 1000000) > > main = main2 Running *this* beast gives: ./Post +RTS -tstderr Main thread starting Delaying <<ghc: 2532328160 bytes, 4831 GCs, 63629171/319163472 avg/max bytes residency (10 samples), 629M in use, 0.00 INIT (0.00 elapsed), 1.02 MUT (1.09 elapsed), 1.57 GC (1.91 elapsed) :ghc>> Notice the improvement! Now we both get to burn the CPU and we have a space leak! Removing the "forever" in startp2 will make this code not leak space. startp2 only spawns a single process, which then runs and each time it takes a cycle in the forever-loop, it builds up live heap. This boggles my mind. Why does this code leak like mad? What could one do to remedy the leak? The leak is PAP and GHC.Base.sat_... (according to -hd heap profiling). In my combinatorrent code it is a ->StateT (does anyone know what PAP and ->StateT means and where to find more definitions like these?). In confusion, -- J. _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe