Re: [GHC] #650: Improve interaction between mutable arrays and GC

Jan-Willem Maessen Tue, 24 Jan 2006 07:42:50 -0800


On Jan 11, 2006, at 6:21 AM, GHC wrote:

#650: Improve interaction between mutable arrays and GC
-------------------------------+--------------------------------------------
    Reporter:  simonmar        |        Owner:
        Type:  task            |       Status:  new
    Priority:  normal          |    Milestone:  6.6
   Component:  Runtime System  |      Version:  6.4.1
    Severity:  normal          |     Keywords:
          Os:  Unknown         |   Difficulty:  Moderate (1 day)
Architecture:  Unknown         |
-------------------------------+--------------------------------------------This is the result of a discussion between myself and Simon PJ afew weeks
 ago, inspired by recent discoveries of poor performance with mutable
 arrays (eg. see Data.Hash discussion on glasgow-haskell-users around
 October 2005).
Note that all this applies to mutable arrays of pointers, i.e.`IOArray`and `STArray`, ''not'' to unboxed arrays, i.e. `IOUArray` and`STUArray`.
...
 === The Problem ===

 The problem is that mutable arrays are always completely traversed on
every GC. To get around this, we can keep an array in a frozenstate andthaw it just before writing, then freeze it again afterward. Thisis abit inconvenient, not to mention unsafe with multiple threadsunless extra
 locking is used.

Yes. I'd love it if "freeze" and "thaw" just affected types plus(perhaps) the ability for mutations to occur in the first place (sothat I don't freeze in one thread and mutate in another).

Given that virtually every object access in GHC involves a hiddenread barrier (entering a thunk is effectively a read barrier whichchecks computedness), I'm not too worried about the costs of a writebarrier for mutation in the IO monad. I'd welcome some up-front costto avoid the GC problems.

 === Solutions ===

You should think seriously about keeping a write list for arrays asan alternative to cards or a full scan, as Bulat suggested. Thereare card/write list hybrids.

If you put a card mark in the block table, there's no particular needto modify the array representation beyond adding a "dirty" bit. Thewrite barrier would look something like this:

  * Set array's dirty bit
  * Set card mark on appropriate block
  * Store pointer into array
Then during GC we scan the dirty blocks of dirty arrays.

Small arrays will fit in a block and will be scanned completely.Larger arrays will span more than one block and benefit from the cardmarks. But there's no need to actually distinguish these cases ifyou're using cards.

Note also that we can keep multiple card flags in the blockdescriptor, so there's actually opportunity for tuning here. It'strading off up-front bit-twiddling for lower GC overhead, though.

Do bear in mind that it's big arrays which really seem to be causingthe trouble here. So I think some solution which does workproportional to the amount of mutation is in order.

For IORefs the game is utterly different; I'm not sure what to doabout them. You'd like to avoid traversing a list of all the IORefsat each collection. Very tricky. Here it might be worth scanningjust the dirty blocks.


Perhaps there ought to be an IORef list per block?

-Jan-Willem Maessen

[Who's very interested in seeing this solved well...]

_______________________________________________
Glasgow-haskell-bugs mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/glasgow-haskell-bugs

Re: [GHC] #650: Improve interaction between mutable arrays and GC

Reply via email to