On Mon, Nov 11, 2013 at 9:25 AM, Jason Orendorff <[email protected]> wrote:
> On Fri, Nov 8, 2013 at 1:35 PM, Mark S. Miller <[email protected]> wrote:
> (re: weakrefs and post-mortem finalization)
> > They are needed for many other things, such as
> > distributed acyclic garbage collection (as in adapting the CapTP ideas to
> > distributed JS).
>
> I'm not convinced acyclic distributed GC is a good thing to support.
>
> JS users do not want RPC systems where one process's memory usage
> depends on getting per-object callbacks from an untrusted peer's GC
> implementation.

Some will. I do. See <http://research.google.com/pubs/pub40673.html>.

Why do you believe manual deallocation decisions will be easier in distributed systems than they are locally? If anything, local manual deallocation should be easier, and it has already proven hard enough that people (except C++ programmers) have turned to local GC.

You are correct that a distributed mutually suspicious system must support manual deallocation as well. Your Erlang example is quite telling: Erlang does have strong cross-process references, the process id. However, because they are forgeable, processes cannot be garbage collected. The decision to terminate a process is the decision to preemptively terminate service to clients that may still exist. Sometimes this needs to be done, even with GC, because the client causes the service to retain more memory than the service wishes to continue to devote to this client. As the Erlang example also indicates, the natural unit for such preemptive manual deallocation decisions is the vat/worker/process.

However, many clients will engage in honest GC to keep their requirements on service memory low. Many services will not need to cut such clients off because of excessive resource demands.

E/CapTP and Cap'n Proto have an additional form of manual deallocation decision besides vat termination. Between a pair of vats there are both "offline references", which survive partition, and "live references", which last only up to partition.
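
As a rough illustration of the live/offline split just described, here is a minimal sketch in Python. Every name in it (`Vat`, `export_live`, `export_offline`, `partition`, `reconnect`) is made up for this example and is not taken from E, CapTP, or Cap'n Proto; it only models the bookkeeping, under the assumption that a vat tracks which objects it retains on behalf of each peer.

```python
# Hypothetical sketch: these names are illustrative only and do not come
# from E, CapTP, or Cap'n Proto.
class Vat:
    def __init__(self, name):
        self.name = name
        # peer name -> set of object ids that peer holds live references to
        self.live_exports = {}
        # token -> object id; these entries survive a partition
        self.offline_exports = {}

    def export_live(self, peer, obj_id):
        """Record that `peer` holds a live reference to `obj_id`."""
        self.live_exports.setdefault(peer, set()).add(obj_id)

    def drop_live(self, peer, obj_id):
        """An honest peer's GC reports that it no longer holds `obj_id`."""
        self.live_exports.get(peer, set()).discard(obj_id)

    def export_offline(self, token, obj_id):
        """Record an offline reference, redeemable after reconnection."""
        self.offline_exports[token] = obj_id

    def partition(self, peer):
        """Sever the connection: every live reference held by `peer` is dropped."""
        return self.live_exports.pop(peer, set())

    def reconnect(self, token):
        """After a partition, only offline references can be redeemed."""
        return self.offline_exports.get(token)

    def retained(self):
        """Object ids this vat must keep alive on behalf of other vats."""
        ids = set(self.offline_exports.values())
        for held in self.live_exports.values():
            ids |= held
        return ids
```

In this model, a cooperative client shrinks `retained()` through `drop_live`, while `partition` is the preemptive option: it releases everything the peer held live at once and leaves only the offline entries.
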
Because offline references can be reconnected after a partition, they are not subject to GC. Instead, E provides three hooks for manually deallocating them. From Chapter 17.4 "Persistence" of <http://erights.org/talks/thesis/markm-thesis.pdf>:

The operations for making an offline capability provide three options for ending this obligation: It can expire at a chosen future date, giving the association a time-to-live. It can expire when explicitly cancelled, making the association revocable. And it can expire when the hosting vat incarnation crashes, making the association transient. An association which is not transient is durable.

Since vats must be prepared for inter-vat partition, a vat can preemptively induce a partition with a counterparty vat, in order to preemptively sever the live references between them, forcing reconnection to rely on the offline references subject to the above manual policies. In E/CapTP and Cap'n Proto, the distributed GC governs only these transient live refs, which substantially reduces the pressure to preemptively sever these connections. (The NodeKen system on which Dr. SES will be built does not make this split, forcing it to rely on manual deallocation of vats rather than connections in order to manually reclaim memory. There is a place in the world for each failure model. Here I am arguing only for the CapTP/Cap'n Proto failure model.)

Ultimately, the only principled solution for distributed storage management among mutually suspicious machines is some form of quid pro quo, such as the *market-sweep algorithms* <http://e-drexler.com/d/09/00/AgoricsPapers/agoricpapers/ie/ie3.html>. But even after 25 years, these still seem premature. Distributed GC + preemptive deallocation for extreme conditions, either of vats or connections, is a great compromise in the meantime.

> There are already many ways to drop stuff from one process when
> another process probably doesn't need it anymore.
> It doesn't require nondeterministic language features. Consider, in
> the simplest case, "session data" (the capability in question is
> represented on the wire as an HTTP cookie) that expires on a timer.
> Or IPDL's managed hierarchy of actors
> <https://developer.mozilla.org/en-US/docs/IPDL/Tutorial#Subprotocols_and_Protocol_Management_>,
> where all references across a given link form a hierarchy, and a whole
> subtree can be dropped with a single message. This approach reduces
> traffic as well as opportunities for leak-inducing errors; and it's
> totally deterministic. Or consider Erlang: one of the best designs for
> distributed computing has no strong cross-process references at all.
>
> -j

--
Cheers,
--MarkM
_______________________________________________
es-discuss mailing list
[email protected]
https://mail.mozilla.org/listinfo/es-discuss

