Okay. So now for concurrent object forwarding. That is: forwarding performed by a collector while the mutator is active.
The first thing to say is that I don't think a read barrier is sufficient. All of the counting operations and optimizations look to me like they can be done by a write barrier. Forwarding is very different. The problem, as I noted before, is that the mutator may perform reads from the wrong location while forwarding is underway, and there is really no effective way to compile this problem out.
I think the solution is to observe that the only reason we are forwarding is to clear a nearly free block. If that is so, then forwarding is a block-grain operation, and step one is to remap the source block to a safe place, so that both reads and writes to the old block location fault. Having done that, we now have two options.
The first is to make optimistic copies of all live objects. If a mutator incurs a page fault during the copy, we remove the forwarding pointers and release the mutator to run. The source block is hot, clearing the block has failed, and we tiptoe quietly back out of the way. Since the target copies were made to contiguous space, we should be able to roll back the bump allocator in this situation.
The second option is to force the blocked mutator into a binary dynamic translator that will soft-translate around the block we are moving. Naively: it dynamically translates the mutator code stream in such a way that all memory references are sandboxed, and references into the frozen block are soft-translated to the new object locations. Dynamic translation continues until the collector's lock on the source block is released, whereupon control is returned to the main path of the mutator. Dynamic translation also ends at a safe point, because the mutator thread can update its own references at that point and stop referencing the source block. The worst that happens is that the mutator faults again and gets pushed back into the dynamic translator.
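To make the first two options a bit more concrete, here is a toy model in Python. It is only a sketch of the bookkeeping, not collector code: the names Heap, try_clear_block, soft_translate, and the mutator_faulted callback are all made up for illustration, the callback stands in for the page-fault notification from the remapped block, and the rollback assumes that nothing else allocates from the target space while the copy is in progress.

```python
class Heap:
    """Toy target space: a bump allocator over a dict of address -> payload."""

    def __init__(self, base):
        self.bump = base   # next free address in the target space
        self.cells = {}    # address -> payload

    def alloc(self, payload, size):
        addr = self.bump
        self.cells[addr] = payload
        self.bump += size
        return addr


def try_clear_block(source, target, mutator_faulted):
    """Option one: optimistically copy the live objects out of `source`.

    `source` maps old address -> (payload, size) for the live objects.
    `mutator_faulted` is a hypothetical callback polled after each copy.
    Returns the forwarding table on success, or None after a rollback.
    """
    saved_bump = target.bump   # where the contiguous run of copies starts
    forwarding = {}            # old address -> new address
    for old_addr, (payload, size) in source.items():
        forwarding[old_addr] = target.alloc(payload, size)
        if mutator_faulted():
            # The block is hot: discard the copies and the forwarding
            # entries, then roll the bump allocator back.
            for new_addr in forwarding.values():
                del target.cells[new_addr]
            target.bump = saved_bump
            return None
    return forwarding


def soft_translate(addr, forwarding):
    """Option two: redirect a reference into the frozen block.

    Addresses outside the frozen block pass through unchanged.
    """
    return forwarding.get(addr, addr)
```

In this model a successful call returns the forwarding table that the soft-translation path would consult, and a failed call leaves the target space exactly as it found it. The cheap rollback depends entirely on the copies being contiguous at the end of the bump region.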
The source block remains unmapped until all pointers into the (now empty) source block have been forwarded. Approaches one and two can be blended, with dynamic translation used as a fallback when the source block is hot.
A third option is to do NaCl-like code emission in the first place, but the overhead of that seems high. It has the advantage that block remapping is not necessary. Whether it makes sense depends on how much block clearing we do.
A fourth option would be to use dynamic binary translation routinely, because the HDTrans way of doing that carries very little overhead. That's potentially interesting, because it would allow us to add pseudo-barrier instructions to the input stream and selectively compile these in or out at need.
A fifth option would be to rely on a loose rendezvous, in which all mutators must enter dynamic translation before the collectors can start migrating. This eliminates the need to fiddle with the mapping structures.
End of dump.
Shap
