Jan-Willem Maessen wrote:
On Jan 11, 2006, at 6:21 AM, GHC wrote:
#650: Improve interaction between mutable arrays and GC
-------------------------------
+--------------------------------------------
Reporter: simonmar | Owner:
Type: task | Status: new
Priority: normal | Milestone: 6.6
Component: Runtime System | Version: 6.4.1
Severity: normal | Keywords:
Os: Unknown | Difficulty: Moderate (1 day)
Architecture: Unknown |
-------------------------------
+--------------------------------------------
This is the result of a discussion between myself and Simon PJ a few
weeks
ago, inspired by recent discoveries of poor performance with mutable
arrays (eg. see Data.Hash discussion on glasgow-haskell-users around
October 2005).
Note that all this applies to mutable arrays of pointers, i.e.
`IOArray`
and `STArray`, ''not'' to unboxed arrays, i.e. `IOUArray` and
`STUArray`.
...
=== The Problem ===
The problem is that mutable arrays are always completely traversed on
every GC. To get around this, we can keep an array in a frozen
state and
thaw it just before writing, then freeze it again afterward. This is a
bit inconvenient, not to mention unsafe with multiple threads unless
extra
locking is used.
Yes. I'd love it if "freeze" and "thaw" just affected types plus
(perhaps) the ability for mutations to occur in the first place (so
that I don't freeze in one thread and mutate in another).
Given that virtually every object access in GHC involves a hidden read
barrier (entering a thunk is effectively a read barrier which checks
computedness), I'm not too worried about the costs of a write barrier
for mutation in the IO monad. I'd welcome some up-front cost to avoid
the GC problems.
I probably should have mentioned this to you, but we've implemented a
partial solution to this already:
http://www.haskell.org//pipermail/cvs-ghc/2006-January/028009.html
http://www.haskell.org//pipermail/cvs-ghc/2006-January/028010.html
You should think seriously about keeping a write list for arrays as an
alternative to cards or a full scan, as Bulat suggested. There are
card/write list hybrids.
If you put a card mark in the block table, there's no particular need
to modify the array representation beyond adding a "dirty" bit. The
write barrier would look something like this:
* Set array's dirty bit
* Set card mark on appropriate block
* Store pointer into array
Then during GC we scan the dirty blocks of dirty arrays.
I quite agree, and this is exactly what I'd planned to do (eventually).
Having a dirty bit per array in addition to the per-block bit has two
benefits: it means you don't have to look at every block descriptor in a
clean array, and secondly when there are multiple small arrays in a
single block we won't end up scanning all of them if one of them is
written to.
Small arrays will fit in a block and will be scanned completely.
Larger arrays will span more than one block and benefit from the card
marks. But there's no need to actually distinguish these cases if
you're using cards.
Note also that we can keep multiple card flags in the block descriptor,
so there's actually opportunity for tuning here. It's trading off
up-front bit-twiddling for lower GC overhead, though.
Do bear in mind that it's big arrays which really seem to be causing
the trouble here. So I think some solution which does work
proportional to the amount of mutation is in order.
For IORefs the game is utterly different; I'm not sure what to do about
them. You'd like to avoid traversing a list of all the IORefs at each
collection. Very tricky. Here it might be worth scanning just the
dirty blocks.
The IORef write barrier we implemented seems to work pretty well: when
an IORef is written to, it is put on the mutable list iff it was
previously clean. The write barrier didn't decrease performance
measurably for a program that did a lot of IORef manipulations, but the
GC load was much reduced (see nofib/gc_bench).
Cheers,
Simon
_______________________________________________
Glasgow-haskell-bugs mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/glasgow-haskell-bugs