On May 6, 2009, at 6:41 PM, Drew Wilson wrote:
Following up. I think I have my head around how Worker GC is
happening (I may start another thread about that, as it looks like
there are some cases where the thread won't be shut down, but the
general design is sound).
MessagePort GC is a little trickier, because we need to detect when
both sides have no external references, based on this part of the
HTML5 spec:
[...] a message port can be received, given an event listener, and
then forgotten, and so long as that event listener could receive a
message, the channel will be maintained.
Of course, if this was to occur on both sides of the channel, then
both ports would be garbage collected, since they would not be
reachable from live code, despite having a strong reference to each
other.
From looking at the code in bindings/js, it looks like I've got two
tools to manage object reachability:
1) I can tell when my object is reachable (during a GC) because
mark() will be invoked on it.
2) I can force my object to stay active (as long as the owning
context is active) by making it an ActiveDOMObject and returning
true from hasPendingActivity() (which seems like it does nothing but
invoke mark() on the object).
So, #2 lets me keep an object alive, but to implement the spec, I
need to be able to detect when my object has no more references,
without actually having it get garbage collected. If I can do that,
then I can build my own distributed state mechanism to allow me to
determine when it's safe to GC the objects.
I'm looking through the JSC::Collector code, and I didn't see
anything that did exactly what I want, but there are probably some
things that we could do with protect() to enable this. Has anyone
else had to do anything like what I describe above? It's not exactly
even a multi-thread issue, as it seems like this problem would occur
even with just a single thread.
It is specifically a multi-thread issue, because with a single thread
and single heap both MessagePorts could just mark() each other - if
they have no other references, they will be collected anyway because
GC will happily collect an unreferenced cycle.
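To illustrate the single-heap case, here is a minimal toy mark-and-sweep sketch in C++. It is not JavaScriptCore's collector; Cell and ToyHeap are hypothetical names used only for this example. It shows two cells that reference each other being freed once no root reaches either of them.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy single-heap mark-and-sweep. Hypothetical types, not JSC's.
struct Cell {
    std::vector<Cell*> refs;  // outgoing strong references
    bool marked = false;
};

struct ToyHeap {
    std::vector<Cell*> cells;  // every cell allocated in this heap
    std::vector<Cell*> roots;  // cells referenced from live code

    ~ToyHeap() { for (Cell* c : cells) delete c; }

    Cell* allocate() {
        cells.push_back(new Cell);
        return cells.back();
    }

    static void mark(Cell* c) {
        assert(c != nullptr);
        if (c->marked) return;  // already visited; this also ends cycles
        c->marked = true;
        for (Cell* r : c->refs) mark(r);
    }

    // Marks from the roots, frees everything unmarked, and returns
    // the number of cells freed.
    std::size_t collect() {
        for (Cell* c : cells) c->marked = false;
        for (Cell* r : roots) mark(r);
        std::vector<Cell*> survivors;
        std::size_t freed = 0;
        for (Cell* c : cells) {
            if (c->marked) {
                survivors.push_back(c);
            } else {
                delete c;
                ++freed;
            }
        }
        cells.swap(survivors);
        return freed;
    }
};
```

If two cells point at each other and one of them is a root, collect() frees nothing; once the root is dropped, both are freed in the same pass even though each still holds a reference to the other.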
It's only the separate per-thread heaps that make it challenging,
since GC may occur at different times and on separate heaps, so the
two MessagePorts have to protect each other in a persistent way until
both become unreachable.
The best way I can think of to handle this is to have a special phase
after normal marking where objects with an external/cross-thread
reference get marked in a distinctive way. Then each MessagePort would
know if it was marked solely due to its opposite endpoint being live.
I don't recall if there is a way for an unreachable MessagePort to
become reachable - I think yes, because the message event listener can
stuff the MessagePort in a global variable. But I think an unreachable
port can only become reachable by receiving a message. Thus, you need
a core data structure for the MessageChannel which detects the case
that there are no messages pending in either direction and both endpoints
are alive only due to the other endpoint. Something like that. This is
a very rough design sketch; Alexey can probably explain in more detail
or I can study the code.
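One way to picture the per-channel data structure described above is the following C++ sketch. Everything in it is an assumption made for illustration: ChannelState and its methods are hypothetical names, std::mutex stands in for whatever locking primitive the real code would use, and the sketch ignores ports that are themselves in transit inside a message. Each heap is assumed to report, after its own marking pass, whether its endpoint was reached from local roots rather than only through the opposite endpoint.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical shared state for one MessageChannel; not WebKit code.
class ChannelState {
public:
    // Called by the heap owning endpoint `side` (0 or 1) after marking:
    // was the endpoint reached from that heap's own roots?
    void setLocallyReachable(int side, bool reachable) {
        assert(side == 0 || side == 1);
        std::lock_guard<std::mutex> lock(m_mutex);
        m_locallyReachable[side] = reachable;
    }

    // A message was queued for delivery to endpoint `toSide`.
    void messageQueued(int toSide) {
        assert(toSide == 0 || toSide == 1);
        std::lock_guard<std::mutex> lock(m_mutex);
        ++m_pending[toSide];
    }

    // A message was delivered to endpoint `toSide`. The listener may
    // have stored the port somewhere reachable, so conservatively
    // treat the endpoint as reachable until its heap reports again.
    void messageDelivered(int toSide) {
        assert(toSide == 0 || toSide == 1);
        std::lock_guard<std::mutex> lock(m_mutex);
        assert(m_pending[toSide] > 0);
        --m_pending[toSide];
        m_locallyReachable[toSide] = true;
    }

    // True only when no messages are pending in either direction and
    // neither endpoint is reachable from its own heap's roots.
    bool canCollect() const {
        std::lock_guard<std::mutex> lock(m_mutex);
        return !m_locallyReachable[0] && !m_locallyReachable[1]
            && m_pending[0] == 0 && m_pending[1] == 0;
    }

private:
    mutable std::mutex m_mutex;
    bool m_locallyReachable[2] = {true, true};  // conservative default
    unsigned m_pending[2] = {0, 0};
};
```

While canCollect() is false, each heap would keep its endpoint alive in the extra marking phase; once it becomes true, neither side can send, so neither side can become reachable again and both ports can be released.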
My impression is that Workers use a similar scheme with a special
additional marking phase, or once did, but Alexey will recall better
than I.
- Maciej
-atw
2009/5/6 Drew Wilson <[email protected]>
Thanks, this puts me on the right track. I've had a bunch of
discussions with the Chrome folks on how we'd track MessagePort
reachability in Chrome, but I'd hoped that the problem might be
simpler in WebKit since we had direct access to the data structures
cross-thread. The existence of separate GC heaps means it's not
particularly simpler after all.
-atw
2009/5/6 Maciej Stachowiak <[email protected]>
On May 6, 2009, at 1:53 PM, Drew Wilson wrote:
OK, that's good to know (it only supports document contexts) -
clearly some work has been done to prepare for multi-thread usage
(for example, the core data structure is a thread-safe MessageQueue).
I'm quite happy to drive this design (in fact, I'm in the middle of
this now) but I would like to make sure I understand in general
what the correct approach is for managing GC-able objects that are
accessed cross-thread - I haven't been able to find any
documentation (outside of the code itself).
Is the right approach to use JSLock when manipulating cross-thread
linkage? I'll write up a quick document to describe the approach
I'm taking, but I'd like to understand your concerns about
deadlocks. So long as we have only a single shared per-channel
mutex, and we never grab any other locks (like JSLock) after
grabbing that mutex, we should be OK. Are there other locks that
may be grabbed behind the scenes that I should be aware of?
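The lock discipline proposed in the question above (one per-channel mutex, with no other lock taken while it is held) can be sketched in C++ as follows. This is only an illustration under stated assumptions: ChannelQueue is a hypothetical name, not WebKit's MessageQueue, and std::mutex stands in for the real primitive. It does not address whether this discipline is sufficient when GC is involved, which is the open question in this thread.

```cpp
#include <deque>
#include <iterator>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical per-channel queue whose mutex is a "leaf" lock: no
// other lock is acquired and no callback is run while it is held.
class ChannelQueue {
public:
    void post(std::string message) {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_queue.push_back(std::move(message));
    }

    // Moves all pending messages out under the mutex and returns with
    // the mutex released, so the caller may take other locks while
    // dispatching without creating a lock-order cycle through here.
    std::vector<std::string> takeAll() {
        std::vector<std::string> out;
        std::lock_guard<std::mutex> lock(m_mutex);
        out.assign(std::make_move_iterator(m_queue.begin()),
                   std::make_move_iterator(m_queue.end()));
        m_queue.clear();
        return out;
    }

private:
    std::mutex m_mutex;
    std::deque<std::string> m_queue;
};
```

The point of the design is that the critical sections only touch the queue itself; dispatching the messages returned by takeAll() happens after the mutex has been dropped.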
JSLock is not the right approach. Workers have their own completely
separate GC heap. JSLock only locks the current context group's
heap. It will not prevent collection in other heaps.
I don't know exactly what the right approach is. Ultimately it's a
distributed GC problem, both for our split-heap multithreading and
for an approach that used processes for workers. And distributed GC
is hard.
However, Worker itself has a similar issue, since it can be kept
alive either from the inside or the outside reference. You could
look at how that problem was solved.
- Maciej
-atw
2009/5/6 Alexey Proskuryakov <[email protected]>
On 06.05.2009, at 21:38, Drew Wilson wrote:
It looks like the JSC collection code relies on JSLock to lock the
heap - I'm guessing that I'll need to explicitly grab the JSLock
whenever I'm manipulating the linkage between the two ports, is
that correct? Or is there a different/better way to handle
situations like this?
The JavaScriptCore implementation of MessagePorts only supports
document contexts (i.e., it only works on the main thread).
As mentioned earlier, the first thing needed to implement
MessagePorts in workers is a design of how they can be passed
around without breaking GC. It is likely that taking a lock
whenever atomicity is desired will cause deadlocks.
- WBR, Alexey Proskuryakov
_______________________________________________
webkit-dev mailing list
[email protected]
http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev