Re: [Haskell-cafe] Re: Real-time garbage collection for Haskell
On 2010-03-06 12:42 + (Sat), Simon Marlow wrote: Usually I find keeping the nursery size (-A) close to the L2 cache size works best, although sometimes making it really big can be even better. Interesting to know. I got the impression that I was being encouraged to keep -A closer to the L1 cache size, myself. -qg disables parallel GC completely. This is usually terrible for locality, because every GC will move all the recently allocated data from each CPU's L2 cache into the cache of the CPU doing GC, where it will have to be fetched out again after GC. I've since explained to Cranshaw (we are getting to have *way* too many 'Simon's around here) about the issues with our different machines; some of this depends on the host on which we're doing the testing. * Our Core i7 hosts share 8 MB of L3 cache amongst four cores with two threads each. Thus, no locality penalties here. * Our Xeon E5420 host has two 4-core CPUs, and each pair of cores shares a 6 MB L2 cache. Thus there's a pretty good chance that something you need is in someone else's cache. I don't know if there's any difference between moving stuff between two caches on the same CPU and two caches on different CPUs. * Our Xeon E5520 host has two 4-core CPUs, each core of which has two threads. Each CPU (4 cores) shares an 8 MB L3 cache. Thus, presumably, less locality penalty than the E5420 but more than an i7. As a side note, I also see slightly less memory bandwidth on this system (for both caches and main memory) than I do on an i7. This gets complex pretty fast. And don't ask me about Intel's new style of having L1 and L3 or L2 and L3 caches rather than L1 and L2 caches. -qb disables load-balancing in the parallel GC, which improves locality at the expense of parallelism, usually I find it is an improvement in parallel programs. I'd think so too. Figuring out what went on here is going to have to wait until I get more detailed GC information in the eventlog. Followups to glasgow-haskell-us...@haskell.org. cjs -- Curt Sampson c...@cynic.net +81 90 7737 2974 http://www.starling-software.com The power of accurate observation is commonly called cynicism by those who have not got it.--George Bernard Shaw ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: Real-time garbage collection for Haskell
On 2010-03-02 14:17 + (Tue), Simon Marlow wrote: System.Mem.performGC is your friend, but if you're unlucky it might do a major GC and then you'll get more pause than you bargained for. Anybody calling that is a really poor unlucky sod, because, as far as I can tell from reading the sources, System.Mem.performGC always does a major GC. cjs -- Curt Sampson c...@cynic.net +81 90 7737 2974 http://www.starling-software.com The power of accurate observation is commonly called cynicism by those who have not got it.--George Bernard Shaw ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: Real-time garbage collection for Haskell
A fully concurrent GC running on multiple threads/cores might be great, but I guess this is difficult to implement and introduces a lot of overhead. For simple video games, it might work to always do a full GC per frame, but don't allow it to take more than T milliseconds. In a sense the GC function should be able to be paused and resumed, but it could run on the same thread as the game loop, so no synchronization overhead would arise (in a sense many video games are already little cooperative multitasking systems). The game loop should then decide how to balance the time given to the GC and the memory being collected. This would cause longer frame times and hence sometimes a decrease in frame rate, but never long stalls. Note that the issue also popped up for me in C many years ago. Using Watcom C/C++ in the nineties, I found out that a call to the free function could take a very long time. Also for games that could run many hours or days (e.g. a multi player game server) the C heap could get very fragmented, causing memory allocations and deallocations to take ever more time, sometimes even fail. To make my games run smoothly I had to implement my own memory manager. To make them run for a very long time, I had to implement a memory defragmenter. So in the end, this means a garbage collector :-) Of course this problem is well known, I just wanted to re-state that for games the typical C heap is not very well suitable either. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: Real-time garbage collection for Haskell
Simon Marlow wrote: So it would be pretty easy to provide something like disableMajorGC, enableMajorGC :: IO () Of course leaving it disabled too long could be bad, but that's your responsibility. It seems like it'd be preferable to have an interface like: withMajorGCDisabled :: IO() - IO() or (for some definition of K): withMajorGCDisabled :: (K - IO()) - IO() in order to ensure that it always gets turned back on eventually. Of course, the latter can be created from the former pair. It's just that the former reminds me a bit much of explicit memory management and how difficult it is to balance the free()s... -- Live well, ~wren ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: Real-time garbage collection for Haskell
On 2 Mar 2010, at 21:38, Simon Marlow wrote: On 02/03/10 20:37, Luke Palmer wrote: On Tue, Mar 2, 2010 at 7:17 AM, Simon Marlowmarlo...@gmail.com wrote: For games, though, we have a very good point that occurs regularly where we know that all/most short-lived objects will no longer be referenced - at the start of a fresh frame. System.Mem.performGC is your friend, but if you're unlucky it might do a major GC and then you'll get more pause than you bargained for. Some fine-grained control might be nice here. Eg. I could do a major GC as a player is opening a menu, on a loading screen, when the game is paused, or some other key points, and it might still be annoying, but at least it wouldn't interfere with gameplay. There is of course the question of what happens if one of these key points doesn't happen when we need to do an allocation, but... oh well. Perhaps that could be mitigated by saying I would rather you allocate than major GC right now. Are any of these options impossible, or be unreasonably difficult to implement (I don't suspect so)? Actually that's one thing we can do relatively easily, i.e. defer major GC for a while. Due to the way GHC has a two-layer memory manager, the heap is a list of discontiguous blocks, so we can always allocate some more memory. So it would be pretty easy to provide something like disableMajorGC, enableMajorGC :: IO () Of course leaving it disabled too long could be bad, but that's your responsibility. Oh, I just checked and System.Mem.performGC actually performs a major GC, here's its implementation: foreign import ccall {-safe-} performMajorGC performGC :: IO () to perform a minor GC (or possibly a major GC if one is due), you want this: foreign import ccall {-safe-} performGC performMinorGC :: IO () Cheers, Simon ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe Is there a similar set of runes to be able to see how much mutation has occurred, how much was live last GC, etc? Neil ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: Real-time garbage collection for Haskell
On 03/03/2010 08:43, Neil Davies wrote: On 2 Mar 2010, at 21:38, Simon Marlow wrote: On 02/03/10 20:37, Luke Palmer wrote: On Tue, Mar 2, 2010 at 7:17 AM, Simon Marlowmarlo...@gmail.com wrote: For games, though, we have a very good point that occurs regularly where we know that all/most short-lived objects will no longer be referenced - at the start of a fresh frame. System.Mem.performGC is your friend, but if you're unlucky it might do a major GC and then you'll get more pause than you bargained for. Some fine-grained control might be nice here. Eg. I could do a major GC as a player is opening a menu, on a loading screen, when the game is paused, or some other key points, and it might still be annoying, but at least it wouldn't interfere with gameplay. There is of course the question of what happens if one of these key points doesn't happen when we need to do an allocation, but... oh well. Perhaps that could be mitigated by saying I would rather you allocate than major GC right now. Are any of these options impossible, or be unreasonably difficult to implement (I don't suspect so)? Actually that's one thing we can do relatively easily, i.e. defer major GC for a while. Due to the way GHC has a two-layer memory manager, the heap is a list of discontiguous blocks, so we can always allocate some more memory. So it would be pretty easy to provide something like disableMajorGC, enableMajorGC :: IO () Of course leaving it disabled too long could be bad, but that's your responsibility. Oh, I just checked and System.Mem.performGC actually performs a major GC, here's its implementation: foreign import ccall {-safe-} performMajorGC performGC :: IO () to perform a minor GC (or possibly a major GC if one is due), you want this: foreign import ccall {-safe-} performGC performMinorGC :: IO () Cheers, Simon ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe Is there a similar set of runes to be able to see how much mutation has occurred, how much was live last GC, etc? No, but there could be - the information is collected by the RTS, it just isn't made available via a public API right now. Cheers, Simon ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: Real-time garbage collection for Haskell
Both concurrent GC and incremental GC tend to add overheads to the mutator, because they need a read barrier. There was an incremental GC for GHC once [1], taking advantage of the built-in read barrier that we have whereby most closures are entered Was there a specific reason why that GC implementation chose to use a read barrier rather than a write barrier? I would have thought that in general, a write barrier is cheaper to implement. Doesn't ghc update fewer thunks than it enters? Regards, Malcolm ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: Real-time garbage collection for Haskell
On Sun, Feb 28, 2010 at 10:03 AM, Heinrich Apfelmus apfel...@quantentunnel.de wrote: Luke Palmer wrote: I have seen some proposals around here for SoC projects and other things to try to improve the latency of GHC's garbage collector. I'm currently developing a game in Haskell, and even 100ms pauses are unacceptable for a real-time game. I'm calling out to people who have seen or made such proposals, because I would be willing to contribute funding and/or mentor a project that would contribute to this goal. Also any ideas for reducing this latency in other ways would be very appreciated. Overly long garbage collection might also be a sign of space leaks. But there are many other things that can go wrong in a real time system and might explain your delays. For example, you might need to avoid amortized time data structures like Data.Sequence . Or for physics simulations, you'd need to fix the time step ∆t, as described in http://gafferongames.com/game-physics/fix-your-timestep/ or numerical integration will deteriorate rather quickly. Incidentally, what's described there is a simplified version of the frequency locked loops described in Massalin's thesis on Synthesis OS, and it is used there for about the same purpose in a soft real-time context. http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.29.4871 ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe