Okay. So now for concurrent object forwarding. That is: forwarding
performed by a collector while the mutator is active.

The first thing to say is that I don't think a read barrier is sufficient.
All of the counting operations and optimizations look to me like they can
be done by a write barrier. Forwarding is very different. The problem, as I
noted before, is that the mutator may perform reads from the wrong location
while forwarding is underway, and there is really no effective way to
compile this problem out.

I think the solution is to observe that the only reason we are forwarding
is to clear a nearly free block. If that is so, then forwarding is a
block-grain operation, and step one is to remap the source block to a safe
place, leading to faults on both reads and writes to the old block location.

Having done that, we now have two options. The first is to make optimistic
copies of all live objects. If a mutator incurs a page fault during copy,
we remove the forwarding pointers and release the mutator to run. The
source block is hot, clearing the block has failed, and we tiptoe quietly
back to get out of the way. Since the target copies were made to
contiguous space, we should be able to roll back the bump allocator in this
situation.
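
To make the first option concrete, here is a minimal sketch of the optimistic
copy with rollback, written as a toy Python model rather than real collector
code. Everything named here (BumpAllocator, try_clear_block, the
mutator_faulted callback standing in for the page-fault notification) is
hypothetical and only illustrates the control flow. It assumes the collector
is the only allocator in the target space during the attempt, which is what
makes the rollback safe.

```python
class BumpAllocator:
    """Toy bump allocator over a flat address range (hypothetical model)."""

    def __init__(self, base, size):
        self.base = base
        self.limit = base + size
        self.top = base

    def alloc(self, nbytes):
        if self.top + nbytes > self.limit:
            raise MemoryError("target space exhausted")
        addr = self.top
        self.top += nbytes
        return addr

    def rollback(self, mark):
        # Safe only because the copies were contiguous and nothing else
        # has allocated from this allocator since `mark` was taken.
        self.top = mark


def try_clear_block(live_objects, allocator, mutator_faulted):
    """Optimistically copy live objects out of a source block.

    live_objects: list of (source_address, size) pairs.
    mutator_faulted: callable returning True once a mutator has touched
    the remapped source block (assumed stand-in for the fault handler).
    Returns a forwarding table {old_address: new_address}, or None if the
    attempt was abandoned because the block turned out to be hot.
    """
    mark = allocator.top
    forwarding = {}
    for src, size in live_objects:
        if mutator_faulted():
            forwarding.clear()        # remove the forwarding pointers
            allocator.rollback(mark)  # hand the target space back
            return None
        forwarding[src] = allocator.alloc(size)
    return forwarding
```

The sketch only checks for a fault between objects. A real implementation
would also have to deal with a fault that arrives in the middle of a copy or
after the last one.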

The second option is to force the blocked mutator into a binary dynamic
translator that will soft-translate around the block we are moving.
Naively: it dynamically translates the mutator code stream in such a way
that all memory references are sandboxed, and references into the frozen
block are soft-translated to the new object locations. Dynamic translation
continues until the collector's lock on the source block is released,
whereupon control is returned to the main path of the mutator. Dynamic
translation also ends at a safe point, because the mutator thread can
update its own references at that point and stop referencing the source
block. The worst that happens is that the mutator faults again and gets
pushed back into the dynamic translator.
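
The address redirection at the heart of the second option can be sketched as
follows. This is a toy Python model of the check a translator would apply to
each sandboxed memory reference, not a binary translator. The function name
and the shape of the forwarding table (old object start mapped to new start
and size) are assumptions made for illustration.

```python
def soft_translate(addr, block_base, block_size, forwarding):
    """Redirect an address inside the frozen block to the moved object.

    forwarding maps the old start address of each moved object to a
    (new_start, size) pair. Addresses outside the frozen block pass
    through unchanged.
    """
    if not (block_base <= addr < block_base + block_size):
        return addr
    for old_start, (new_start, size) in forwarding.items():
        if old_start <= addr < old_start + size:
            # Preserve the offset so interior pointers keep working.
            return new_start + (addr - old_start)
    raise LookupError(
        f"address {addr:#x} is in the frozen block but not in a live object"
    )
```

A linear scan over the table keeps the sketch short. Something that has to
run on every translated memory reference would presumably want a cheaper
lookup, for example one indexed by offset within the block.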

The source block remains unmapped until all pointers into the (now empty)
source block have been forwarded.

Approaches one and two can be blended, with dynamic translation used as a
fallback when the source block is hot.

A third option is to do NaCl-like code emission in the first place, but the
overhead of that seems high. It has the advantage that block remapping is
not necessary. Whether it makes sense depends on how much block clearing we
do.

A fourth option would be to use dynamic binary translation routinely,
because the HDTrans way of doing that carries very little overhead. That's
potentially interesting, because it would allow us to add pseudo-barrier
instructions to the input stream and selectively compile these in or out at
need.

A fifth option would be to rely on a loose rendezvous, in which all
mutators must enter dynamic translation before the collectors can start
migrating. This eliminates the need to fiddle the mapping structures.
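
One way to read "loose" is that no mutator ever blocks: each one switches to
translated execution on its own schedule, and only the collector waits.
Under that assumption, a minimal sketch using Python threads looks like
this. The class and method names are hypothetical.

```python
import threading


class LooseRendezvous:
    """Collector waits until every mutator has entered translation."""

    def __init__(self, n_mutators):
        self._remaining = n_mutators
        self._cond = threading.Condition()

    def enter_translation(self):
        # Called once by each mutator as it switches to translated
        # execution. The mutator does not wait for the others.
        with self._cond:
            self._remaining -= 1
            if self._remaining == 0:
                self._cond.notify_all()

    def wait_for_mutators(self, timeout=None):
        # Called by the collector before it starts migrating objects.
        # Returns True once all mutators have entered, False on timeout.
        with self._cond:
            return self._cond.wait_for(
                lambda: self._remaining == 0, timeout
            )
```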

End of dump.


Shap
_______________________________________________
bitc-dev mailing list
[email protected]
http://www.coyotos.org/mailman/listinfo/bitc-dev
