Jan-Willem Maessen wrote:

On Jan 11, 2006, at 6:21 AM, GHC wrote:

#650: Improve interaction between mutable arrays and GC
------------------------------- +--------------------------------------------
    Reporter:  simonmar        |        Owner:
        Type:  task            |       Status:  new
    Priority:  normal          |    Milestone:  6.6
   Component:  Runtime System  |      Version:  6.4.1
    Severity:  normal          |     Keywords:
          Os:  Unknown         |   Difficulty:  Moderate (1 day)
Architecture:  Unknown         |
------------------------------- +-------------------------------------------- This is the result of a discussion between myself and Simon PJ a few weeks
 ago, inspired by recent discoveries of poor performance with mutable
 arrays (eg. see Data.Hash discussion on glasgow-haskell-users around
 October 2005).

Note that all this applies to mutable arrays of pointers, i.e. `IOArray` and `STArray`, ''not'' to unboxed arrays, i.e. `IOUArray` and `STUArray`.
...
 === The Problem ===

 The problem is that mutable arrays are always completely traversed on
every GC. To get around this, we can keep an array in a frozen state and
 thaw it just before writing, then freeze it again afterward.  This  is a
bit inconvenient, not to mention unsafe with multiple threads unless extra
 locking is used.


Yes. I'd love it if "freeze" and "thaw" just affected types plus (perhaps) the ability for mutations to occur in the first place (so that I don't freeze in one thread and mutate in another).

Given that virtually every object access in GHC involves a hidden read barrier (entering a thunk is effectively a read barrier which checks computedness), I'm not too worried about the costs of a write barrier for mutation in the IO monad. I'd welcome some up-front cost to avoid the GC problems.

I probably should have mentioned this to you, but we've implemented a partial solution to this already:

  http://www.haskell.org//pipermail/cvs-ghc/2006-January/028009.html
  http://www.haskell.org//pipermail/cvs-ghc/2006-January/028010.html

You should think seriously about keeping a write list for arrays as an alternative to cards or a full scan, as Bulat suggested. There are card/write list hybrids.

If you put a card mark in the block table, there's no particular need to modify the array representation beyond adding a "dirty" bit. The write barrier would look something like this:
  * Set array's dirty bit
  * Set card mark on appropriate block
  * Store pointer into array
Then during GC we scan the dirty blocks of dirty arrays.

I quite agree, and this is exactly what I'd planned to do (eventually).

Having a dirty bit per array in addition to the per-block bit has two benefits: it means you don't have to look at every block descriptor in a clean array, and secondly when there are multiple small arrays in a single block we won't end up scanning all of them if one of them is written to.

Small arrays will fit in a block and will be scanned completely. Larger arrays will span more than one block and benefit from the card marks. But there's no need to actually distinguish these cases if
you're using cards.

Note also that we can keep multiple card flags in the block descriptor, so there's actually opportunity for tuning here. It's trading off up-front bit-twiddling for lower GC overhead, though.

Do bear in mind that it's big arrays which really seem to be causing the trouble here. So I think some solution which does work proportional to the amount of mutation is in order.

For IORefs the game is utterly different; I'm not sure what to do about them. You'd like to avoid traversing a list of all the IORefs at each collection. Very tricky. Here it might be worth scanning just the dirty blocks.

The IORef write barrier we implemented seems to work pretty well: when an IORef is written to, it is put on the mutable list iff it was previously clean. The write barrier didn't decrease performance measurably for a program that did a lot of IORef manipulations, but the GC load was much reduced (see nofib/gc_bench).

Cheers,
        Simon
_______________________________________________
Glasgow-haskell-bugs mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/glasgow-haskell-bugs

Reply via email to