Re: [whatwg] Combining the DedicatedWorker and SharedWorker interfaces

Alexey Proskuryakov Fri, 14 Nov 2008 01:43:53 -0800


On Nov 14, 2008, at 3:59 AM, Ian Hickson wrote:

>> For the sake of completeness, a connect/startConversation method on a
>> worker really should automatically open the receiving port - this is
>> what examples posted so far implied, and it would cause a lot of
>> aggravation if it didn't. I know I'm often forgetting to open the port
>> when writing my tests, and it's not a very easy mistake to spot.
>
> What do you mean by "open the port"? Do you mean calling start()? If so,
> that should happen automatically when you set onmessage the first time,
> per spec.

Oh, that's my mistake - I totally didn't expect that it could have such
a side effect. It seems weird that addEventListener("message", ...) does
not have such an effect, does it?
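
To make the difference concrete, here is a toy model of a port's message
queue. It is only a sketch under assumptions: ToyPort and its deliver()
helper are hypothetical names, delivery is synchronous for brevity, and
it assumes that addEventListener("message", ...) does not start the
port while setting onmessage does.

```javascript
// Toy model of a port's message queue. This is NOT the real MessagePort;
// ToyPort and deliver() are hypothetical names used for illustration.
class ToyPort {
  constructor() {
    this.queue = [];       // messages that arrived before the port was started
    this.started = false;
    this.listeners = [];
    this._onmessage = null;
  }

  // Setting onmessage implicitly starts the port (the side effect discussed above).
  set onmessage(handler) {
    this._onmessage = handler;
    this.start();
  }
  get onmessage() {
    return this._onmessage;
  }

  // Assumption: registering a listener does NOT start the port.
  addEventListener(type, listener) {
    if (type === "message") this.listeners.push(listener);
  }

  start() {
    this.started = true;
    this.flush();
  }

  // Hypothetical helper: a message arrives from the other side.
  deliver(data) {
    this.queue.push(data);
    if (this.started) this.flush();
  }

  // Dispatch everything that is queued (synchronously, for simplicity).
  flush() {
    while (this.queue.length > 0) {
      const event = { data: this.queue.shift() };
      if (this._onmessage) this._onmessage(event);
      for (const listener of this.listeners) listener(event);
    }
  }
}

// Case 1: addEventListener only. The message stays queued until start().
const viaListener = new ToyPort();
const gotViaListener = [];
viaListener.addEventListener("message", (e) => gotViaListener.push(e.data));
viaListener.deliver("hello");
const receivedBeforeStart = gotViaListener.length; // nothing delivered yet
viaListener.start();

// Case 2: onmessage. The port is started by the assignment itself.
const viaHandler = new ToyPort();
const gotViaHandler = [];
viaHandler.onmessage = (e) => gotViaHandler.push(e.data);
viaHandler.deliver("hello");
```

In the first case the forgotten start() call is silent: nothing fails,
the message just never arrives, which is why the mistake is hard to
spot in a test.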

>> In an async processing model, there is simply no way for the receiver to
>> have a list of all objects that were posted to it - it's exactly the
>> reason for the existence of the queue that events are delivered
>> asynchronously and cannot be peeked before being delivered. For example,
>> in a multi-process implementation, these events may still be across
>> process boundary.
>
> It actually doesn't really matter if there is something that has been
> posted but not yet received, because that is indistinguishable (as far as
> I can tell) from the case of the worker having shut down a split second
> before that object was posted.

I'm not sure what state you mean by "shut down" here - the spec does
not define this, and shutting down a side of an async communication
channel is complicated (see e.g. a TCP/IP state diagram). Anyway, the
contents of "the worker's ports" are used for defining "active needed
worker" and "suspendable worker" further on, which are concepts that
are very important for the definition of worker lifetime. If the ports
in the event queue are not important, then the spec should not say that
they are included in "the worker's ports". This would resolve the
concurrency problem, but I don't think that the resulting behavior
would be desirable.

>> It is not possible to have a symmetric relationship in an asynchronous
>> messaging model - we need a multi-step entangling/unentangling protocol,
>> so the relationship is necessarily asymmetric. One can't freeze another
>> process (or really, even another thread) to change something in it
>> synchronously.
>
> The above is not a requirement, it's just a description of the concept. I
> don't think anything actually depends on it being symmetric; all the parts
> that actually entangle ports have (or, are intended to have, maybe I
> missed some) pretty well-defined synchronisation points.

OK, say there is a pair of entangled ports in different
threads/processes, portA and portB. We concurrently post both with
postMessage, which causes the ports to be cloned. From the point of
view of the first thread, portA is now unentangled, and portA' is
entangled with portB. From the point of view of the second thread,
portB is unentangled, and portB' is entangled with portA.

Next, the threads send asynchronous notifications to each other, asking
to update entangling information. The first thread's notification asks
portB to become entangled with portA'. So, portB will need to forward
this notification to portB' (and possibly further, because portB' may
have been posted and cloned again). This already is unduly complicated.
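
The scenario can be traced with a small single-threaded simulation. It
is a sketch under assumptions, not how any implementation works: the
helper names (makeSide, addPort, clonePort, applyUpdate) and the
movedTo forwarding pointer are hypothetical, and each clone is kept in
the table of the side that created it to keep the example short.

```javascript
// Each "side" (thread or process) only knows its own local view of the ports.
function makeSide() {
  return { ports: new Map() };
}

function addPort(side, name, peer) {
  side.ports.set(name, { peer, movedTo: null });
}

// Posting a port clones it: the original becomes unentangled and keeps a
// forwarding pointer; the clone inherits what THIS side believes the peer is.
function clonePort(side, name) {
  const original = side.ports.get(name);
  const cloneName = name + "'";
  side.ports.set(cloneName, { peer: original.peer, movedTo: null });
  original.peer = null;
  original.movedTo = cloneName;
  return cloneName;
}

// Handle a notification "entangle `target` with `newPeer`". If the target
// was cloned in the meantime, follow forwarding pointers and count the hops.
function applyUpdate(side, target, newPeer) {
  let hops = 0;
  let port = side.ports.get(target);
  while (port.movedTo !== null) {
    target = port.movedTo;
    port = side.ports.get(target);
    hops++;
  }
  port.peer = newPeer;
  return { finalTarget: target, hops };
}

const first = makeSide();
const second = makeSide();
addPort(first, "portA", "portB");
addPort(second, "portB", "portA");

// Both sides post their port "at the same time": neither has seen the
// other's notification yet.
const aClone = clonePort(first, "portA");  // portA'
const bClone = clonePort(second, "portB"); // portB'

// Each side now holds a stale belief about the remote end.
const staleOnFirst = first.ports.get(aClone).peer;   // still names portB
const staleOnSecond = second.ports.get(bClone).peer; // still names portA

// The notifications cross; each one targets a port that no longer exists
// in entangled form and has to be forwarded to its clone.
const updateOnSecond = applyUpdate(second, "portB", aClone);
const updateOnFirst = applyUpdate(first, "portA", bClone);
```

Even in this simplified form, each notification needs one forwarding
hop, and the count grows with every further clone made before the
notification arrives.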

Now consider that all these ports need to be destroyed sooner or later,
but not too soon. This basically means that we now have a many-to-many
distributed GC system. It was bad enough when we had to garbage protect
ports between threads, because this required modification of the
JavaScript interpreter to support a certain case of distributed GC. But
this example basically shows that we need a full-blown distributed GC
system in order to implement port cloning.

> For example, any method that entangles two ports blocks until both
> threads are synchronised and entangled.

This will cause deadlocks - if portB' is sent to the first thread as
portB'' in the above scheme, the lock will not let synchronization
ever finish.
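
One way to picture the deadlock is as a cycle in a wait-for graph. The
sketch below is only an illustration under assumed mechanics: each
thread blocks inside a synchronous entangle step until the other thread
answers, and a blocked thread cannot answer. The thread names and the
isDeadlocked helper are hypothetical.

```javascript
// waitsFor maps a blocked thread to the thread whose answer it is waiting for.
// Returns true if following the chain from `start` comes back to a thread
// that was already visited, i.e. nobody in the chain can ever make progress.
function isDeadlocked(waitsFor, start) {
  const seen = new Set();
  let current = start;
  while (waitsFor.has(current)) {
    if (seen.has(current)) return true;
    seen.add(current);
    current = waitsFor.get(current);
  }
  return false; // the chain ends at a thread that is free to respond
}

// Both threads post a port concurrently and block until the other side
// has synchronised: each one waits for the other.
const bothBlocked = new Map([
  ["thread1", "thread2"],
  ["thread2", "thread1"],
]);

// Only one thread blocks; the other is free to process the request.
const oneBlocked = new Map([["thread1", "thread2"]]);

const deadlock = isDeadlocked(bothBlocked, "thread1");
const noDeadlock = isDeadlocked(oneBlocked, "thread1");
```

An asynchronous protocol avoids the cycle because no thread ever stops
processing incoming notifications while it waits for an answer.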

> (The spec is somewhat implicit about this, but the intent is that workers
> really be implemented either as two system threads, one doing
> communication and one running the JS, or by one system thread that runs
> the JS in an interruptible fashion. In particular, doing something that
> synchronises with a worker isn't expected to have to wait for that worker
> to finish running its current JS.)

The JS thread will need to be interrupted in any case - we certainly
don't want it to read a half-written pointer from memory or something.
Adding memory barriers around access to data that can be modified
externally is not sufficient, because MessagePort algorithms are not
designed in a lock-free fashion (lock-free algorithms that only rely
on read/write atomicity do exist, but these aren't such). Locking
around all MessagePort functions will cause deadlocks, as demonstrated
above, and is generally against best practices. A middle ground may
exist, but it may not, and it's definitely hard to find.

I don't think that pursuing a design that relies on locking is
particularly promising - for the same reason that workers do not
expose shared data to JS programmers, it is highly desirable to not
rely on shared data in implementations, too (except for a few well
understood constructs, such as an event queue). So, I think that the
specs (Web Workers and HTML5 channel messaging) should be cleaned of
anything that mentions synchronous access to an entangled port's
data structures before they can really be verified for correctness.
This is not straightforward, and may seriously affect the API - e.g., I
doubt that passing MessagePorts around is implementable with reasonable
complexity, and there is not a lot of use in MessagePorts if they
cannot be passed around.


- WBR, Alexey Proskuryakov
