Re: Concerns about weak refs and weak maps.
I share Rick's worry about weak maps. It's not clear how to make an efficient GC in the presence of weak maps. As Brendan rightly points out, there is already a lot of complexity in the GC of a typical JS implementation because it is interacting with the (typically) reference-counted, C++-based DOM. Adding complexity in the form of a new weak reference model looks like it could limit JS implementations severely down the road.

One of the hallmarks of a real language implementation vs. a 'toy scripting language' à la PHP is a good GC. I'd really like to see someone do a low-latency, modern GC with weak maps before we hobble the language with something that can't be implemented efficiently. By modern I mean generational and either parallel or concurrent. (In a parallel GC there are several threads doing GC in parallel while the actual JS program is stopped. In a concurrent GC there are one or more threads doing GC while the JS program is still making progress.)

It is true that the language is not multithreaded and never will be. This does not prevent the implementation from making use of several threads/CPUs when doing garbage collection. We actually have a V8 prototype with a parallel GC (http://codereview.chromium.org/3615009/show), though it probably won't be landed in its current form; we will likely be investigating a slightly different approach. But in principle this is quite possible: a single-threaded JS implementation where part of the runtime is multithreaded.

Just for clarity, since Brendan brought up the limitations of the V8 GC: at the moment the V8 GC is accurate (not conservative), moving (resistant to fragmentation), and generational (most pauses are small). But there is plenty of room for improvement in terms of very large heaps, worst-case pause times, and making use of multiple CPUs.

Let's be careful not to add things to the language that will prevent implementations from getting the sorts of modern GC implementations that Java and .NET have.
-- Erik Corry

___
es-discuss mailing list
es-discuss@mozilla.org
https://mail.mozilla.org/listinfo/es-discuss
Re: Concerns about weak refs and weak maps.
On Oct 29, 2010, at 12:08 AM, Erik Corry wrote:

> One of the hallmarks of a real language implementation vs. a 'toy scripting language' à la PHP is a good GC. I'd really like to see someone do a low-latency GC with weak maps before we hobble the language with something that can't be implemented efficiently.

Any chance you guys will implement the WeakMap proposal?

> By modern I mean generational and either parallel or concurrent.

Parallel vs. concurrent GC is a good distinction to bring up. We're interested in parallel GC too, and we have a WeakMap prototype under way, so we'll have to report back in due course.

Still, it seems premature to rule WeakMaps out right now. Rather than strangle them in the cradle, I hope several browser-based engines can implement and see what we find out. The use-cases in the language remain pretty compelling, without a good fallback strategy (leaky strong maps? No object-reference-keyed maps at all?).

Concurrent is a different animal from parallel, and I remain skeptical about JS implementations deviating significantly from shared-nothing concurrency, even as the JS embeddings use threads under the hood to utilize more cores.

/be
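The "leaky strong maps" fallback mentioned above can be sketched concretely. This is an illustration using today's shipped Map and WeakMap (neither existed when this thread was written); the point is that a strong-keyed table pins every key and value for the table's lifetime, while a WeakMap deliberately exposes no size or iteration, so an entry can be collected once its key is otherwise unreachable.

```javascript
// Strong table vs. weak table: same API shape, very different retention.
const strongTable = new Map();
const weakTable = new WeakMap();

function annotate(obj, note) {
  strongTable.set(obj, note); // leak: obj and note live as long as strongTable
  weakTable.set(obj, note);   // collectible: entry can die with obj
}

let key = { id: 1 };
annotate(key, 'metadata');
key = null; // drop the only user reference to the key

console.log(strongTable.size);      // 1: the strong map still pins the pair
console.log(typeof weakTable.size); // 'undefined': WeakMap has no size or iteration
```

The absence of size and iteration on WeakMap is what makes the collection unobservable, which is exactly the property that lets the GC reclaim entries.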
Re: Concerns about weak refs and weak maps.
2010/10/29 Brendan Eich bren...@mozilla.com:

> On Oct 29, 2010, at 12:08 AM, Erik Corry wrote:
>> One of the hallmarks of a real language implementation vs. a 'toy scripting language' à la PHP is a good GC. I'd really like to see someone do a low-latency GC with weak maps before we hobble the language with something that can't be implemented efficiently.
>
> Any chance you guys will implement the WeakMap proposal?

I can't see Google's V8 team doing that, but anyone can take V8 and do experiments on their branch of it.
RE: Concerns about weak refs and weak maps.
Rick and Erik,

I share your concerns about the importance of efficient GC for JavaScript. However, based upon direct experience, I'm not particularly concerned about the impact of weak maps.

The ES Harmony WeakMaps are derived from the Ephemeron concept first developed at Digitalk. We first implemented them in the production GC for Visual Smalltalk and later in ParcPlace-Digitalk (now Cincom) VisualWorks Smalltalk. These were both high-performance, low-latency, accurate, moving, multi-generational collectors that have now been in production use for over 15 years. In general, Smalltalk applications seem to be much more demanding on a GC than most modern JavaScript applications (and correspondingly, my perception is that JavaScript GCs have not yet reached the sophistication and throughput levels that were typical of Smalltalk GCs).

While the above Smalltalk GCs are multi-generational, neither utilized any parallelism. They clearly demonstrate that parallelism is not necessary to achieve low pause times in highly interactive GC'ed systems. Regardless, modern collectors exploiting parallelism exist for both Java and .NET that support various weak pointer abstractions. I'm comfortable that they provide a sufficient existence proof for efficient parallel collectors supporting weak reference semantics.

It is unfortunate that WeakMaps/weak pointers/Ephemerons do impose some overhead that is proportional to their use. But the use of weak abstractions generally shows up in non-ephemeral use cases, so this really isn't a problem. Still, they are a feature that shouldn't be used unnecessarily. However, when they are necessary, there really isn't any other good alternative.

I agree that current JavaScript GCs have a lot of issues that derive from interactions with C/C++-based DOMs. I think the long-term solution is to allocate those objects in the JavaScript GC'ed heap.
My slogan from a recent talk (http://bit.ly/ahiGS1): Build a great GC and then use it everywhere.

Allen
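The Ephemeron-derived design Allen describes surfaced as the WeakMap API. A minimal sketch of that API follows, using the method names that eventually shipped (set/get/has/delete); keys must be objects, since the whole point is tying a value's lifetime to an object key.

```javascript
// Sketch of the proposed WeakMap API (names as in the shipped standard).
const registry = new WeakMap();
const widget = { name: 'button' };

registry.set(widget, { clicks: 0 });
registry.get(widget).clicks += 1;

console.log(registry.has(widget));        // true
console.log(registry.get(widget).clicks); // 1

// Ephemeron semantics: a value may reference its own key without keeping
// the entry alive. Unlike Java's WeakHashMap, this cycle is collectible.
registry.set(widget, { owner: widget });

// Primitive keys are rejected, because a primitive can never become
// unreachable, so weakness would be meaningless for it.
let threw = false;
try {
  registry.set('not-an-object', 1);
} catch (e) {
  threw = e instanceof TypeError;
}
console.log(threw); // true
```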
Re: Concerns about weak refs and weak maps.
On 2010-10-28, at 17:10, Hudson, Rick wrote:

> GC implementation
> Software stacks on multicores will need GCs to become increasingly concurrent and latency free. Weak maps cause concurrent GCs to perform additional mutator synchronizations. While a concurrent GC already has to do some synchronization, each additional synchronization point impedes scalability. My concern is that the number of synchronization points might be proportional to the number of (k, v) pairs in the weak maps (see below).

This is old information, but perhaps relevant: basically, handling weak references on standard hardware by emulating what a Lisp Machine would have done.

[From http://bit.ly/b4xidb]

[...] AWL has another special power: it enables better handling of barrier hits on weak objects. To explain the benefit we need to describe a problem first. The MPS uses a read barrier to perform incremental garbage collection [@@ link to canned-for-client explanation of how a garbage collector works in broad terms]. When the client tries to read an object containing weak references, the MPS may have protected it so that the MPS can process the object before the client gets to see it.

The problem for weak objects is that the client may try to access the object at a point in the collection cycle when the MPS cannot yet determine the status of the objects that the weak object refers to. What the MPS does in this situation is assume that all the referenced objects are going to live. This assumption is correct but conservative; it may result in objects that are weakly referenced staying alive for longer than they need to. In the worst case this can result in a very large amount of memory being used by objects that are no longer needed.

In order to combat this problem, the MPS sometimes does the following: instead of processing the entire weak object and unprotecting it so that the client can access it, the MPS may emulate the processor instruction. When this happens the MPS doesn't process the entire weak object; it processes only the exact location that was being accessed (typically a single word), it emulates the processor instruction, and it keeps the object protected. This happens invisibly from the client's perspective; it's exactly as if the instruction executed as normal. The MPS, instead of processing the entire object, processes just a single word. [...]
Re: Concerns about weak refs and weak maps.
On Oct 29, 2010, at 3:57 AM, Erik Corry wrote:

> 2010/10/29 Brendan Eich bren...@mozilla.com:
>> On Oct 29, 2010, at 12:08 AM, Erik Corry wrote:
>>> One of the hallmarks of a real language implementation vs. a 'toy scripting language' à la PHP is a good GC. I'd really like to see someone do a low-latency GC with weak maps before we hobble the language with something that can't be implemented efficiently.

Complex problems don't go away by renouncing any in-language solution. The complexity moves to the pure-JS level and whatever solution can be hacked there, which then has to be downloaded, and which has bad performance or memory-leak hazards. This then tends to produce non-standard native implementations in various engines. Just noting this; more below.

>> Any chance you guys will implement the WeakMap proposal?
>
> I can't see Google's V8 team doing that, but anyone can take V8 and do experiments on their branch of it.

Cool, maybe we can find a volunteer. Google has projects using JS to manage objects, where I believe in at least one case (Caja), WeakMaps would be a significant win. Such use-cases are not unique to Google; they're common in JS libraries and virtualization schemes. But my first thought is to scour Google for people who need WeakMaps and who could implement an experimental patch for V8.

As people have noted in this thread, you can't really make up for the lack of WeakMaps in JS currently without falling into O(n^2) complexity traps (arrays of objects, linear search). The strong references in such maps make for guaranteed memory leaks. If there's another alternative that lacks both GC complexity and user-facing leak and performance issues, we are all ears.

Doing nothing is likely to invite extensions in browser JS implementations, which may lead to de-facto standardization. We can keep experimenting with WeakMaps, but if we don't have an understanding of why demand for something like them exists and is likely to create a standard of some sort, we won't have a fruitful discussion, and we probably won't serve JS users well.

Renouncing WeakMaps certainly would keep GC implementations, e.g., in V8, simpler, but that's too narrow a point of view for Ecma TC39. And if Apple and other vendors do implement some kind of weak map, market pressure may force V8 to follow. This is not the Harmony way to proceed, since it tends to create more non-interoperation and developer pain in the market, and take longer.

Nevertheless, it sounds like we should reconsider the move of WeakMaps into harmony status (http://wiki.ecmascript.org/doku.php?id=harmony:weak_maps) based on your feedback, more so than Rick's, due to your representing the V8 team. We can move the proposal back to strawman status, pending more results of experiments (plural, so we need help; SpiderMonkey is not enough). Is this your intention?

/be
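The "arrays of objects, linear search" fallback Brendan mentions can be sketched directly. This is a hypothetical helper, not any real library's API: it is correct, but every get/set is O(n) in the number of entries (so n insertions cost O(n^2) comparisons), and the strong array references leak every key for the table's lifetime.

```javascript
// Pre-WeakMap fallback: object-keyed table via parallel arrays and
// identity search. indexOf compares by reference identity, which is
// exactly the key semantics an object-keyed map needs.
function ObjectKeyMap() {
  this._keys = [];
  this._values = [];
}
ObjectKeyMap.prototype.set = function (key, value) {
  const i = this._keys.indexOf(key); // linear identity search
  if (i >= 0) {
    this._values[i] = value;
  } else {
    this._keys.push(key);   // strong reference: the key can never be collected
    this._values.push(value);
  }
};
ObjectKeyMap.prototype.get = function (key) {
  const i = this._keys.indexOf(key);
  return i >= 0 ? this._values[i] : undefined;
};

const table = new ObjectKeyMap();
const a = {}, b = {};
table.set(a, 'first');
table.set(b, 'second');
table.set(a, 'updated');
console.log(table.get(a)); // 'updated'
console.log(table.get(b)); // 'second'
```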
Re: Concerns about weak refs and weak maps.
On Oct 29, 2010, at 12:08 AM, Erik Corry wrote:

> Let's be careful not to add things to the language that will limit implementations from getting the sorts of modern GC implementations that Java and .NET have.

Yet don't Java and .NET have weak-keyed maps of some sort?

/be
Re: Concerns about weak refs and weak maps.
On 10/29/10 10:51, Brendan Eich wrote:

> On Oct 29, 2010, at 12:08 AM, Erik Corry wrote:
>> Let's be careful not to add things to the language that will limit implementations from getting the sorts of modern GC implementations that Java and .NET have.
>
> Yet don't Java and .NET have weak-keyed maps of some sort?

Java has WeakHashMap, which maps weakly held keys to strongly held values. It can leak uncollectible garbage if you have a cycle of WeakHashMap entries holding references to objects that are not otherwise referenced.
RE: Concerns about weak refs and weak maps.
On Friday, October 29, 2010 12:18 PM, felix wrote:

>> Yet don't Java and .NET have weak-keyed maps of some sort?
>
> Java has WeakHashMap, which can leak uncollectible garbage if you have a cycle of WeakHashMap entries holding references to objects that are not otherwise referenced.

Which is the problem addressed by the proposed Ephemeron-like WeakMap solution for ECMAScript.
RE: Concerns about weak refs and weak maps.
C# 4.0 has WeakMaps too, and one of the goals was to work around the cyclical reference issue: http://blogs.msdn.com/b/clrteam/archive/2009/05/18/the-conditional-weak-table-enabling-dynamic-object-properties.aspx

On Friday, October 29, 2010 12:24 PM, Allen Wirfs-Brock wrote:

>> Java has WeakHashMap, which can leak uncollectible garbage if you have a cycle of WeakHashMap entries holding references to objects that are not otherwise referenced.
>
> Which is the problem addressed by the proposed Ephemeron-like WeakMap solution for ECMAScript.
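The linked post's motivating use case for ConditionalWeakTable, attaching "dynamic properties" to objects you don't own, translates directly to the proposed JS WeakMap. The helper names below are made up for illustration; the point is that the host object is never mutated and never pinned in memory by its side-table entry.

```javascript
// A side table of per-object "expando" properties, analogous to .NET's
// ConditionalWeakTable: the target object is the key, a bag of extra
// properties is the value.
const expando = new WeakMap();

function setExpando(obj, name, value) {
  let props = expando.get(obj);
  if (!props) {
    props = {};
    expando.set(obj, props);
  }
  props[name] = value;
}

function getExpando(obj, name) {
  const props = expando.get(obj);
  return props ? props[name] : undefined;
}

const hostObject = {}; // stands in for an object we must not modify
setExpando(hostObject, 'highlighted', true);
console.log(getExpando(hostObject, 'highlighted')); // true
console.log('highlighted' in hostObject);           // false: object untouched
```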
Re: Concerns about weak refs and weak maps.
On 10/29/10 15:18, Jim Deville wrote:

> C# 4.0 has WeakMaps too, and one of the goals was to work around the cyclical reference issue: http://blogs.msdn.com/b/clrteam/archive/2009/05/18/the-conditional-weak-table-enabling-dynamic-object-properties.aspx

Mono implements ConditionalWeakTable with Ephemeron tables registered with the GC. The registration does nothing with Boehm GC, but there is special handling for Ephemerons in the experimental sgen-gc, and it looks to me like it's similar to what's described on the ECMAScript wiki for WeakMap: repeatedly scan Ephemeron (key, value) entries for values that are newly reachable, until no more newly reachable values are found.

http://github.com/mono/mono/blob/09ae7f243df64f6928ffcd89eb44ac1f688c689e/mono/metadata/sgen-gc.c

It's instrumented to count how many times it repeats the Ephemeron scan, so it probably wouldn't be too hard to get practical data on how expensive it is in real usage.
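The "repeatedly scan until nothing new is reachable" loop felix describes can be modeled in a few lines. This is a toy simulation, not any engine's real code: the heap is plain objects, `refs` are strong edges, and `ephemerons` holds [key, value] pairs; a value is traced only once its key is known reachable, so the table is rescanned until a full pass makes no progress.

```javascript
// Fixed-point Ephemeron marking over a simulated heap.
function markWithEphemerons(roots, ephemerons) {
  const marked = new Set();
  function trace(node) {
    if (!node || marked.has(node)) return;
    marked.add(node);
    (node.refs || []).forEach(trace); // follow strong edges
  }
  roots.forEach(trace);

  let passes = 0;
  let progress = true;
  while (progress) {
    progress = false;
    passes += 1;
    for (const [key, value] of ephemerons) {
      if (marked.has(key) && !marked.has(value)) {
        trace(value); // the value becomes reachable; it may unlock other keys
        progress = true;
      }
    }
  }
  return { marked, passes };
}

// a -> (ephemeron) b -> (ephemeron) c, listed in an unlucky order so each
// pass discovers one new value; e stays unmarked because its key d is dead.
const a = {}, b = {}, c = {}, d = {}, e = {};
const result = markWithEphemerons([a], [[b, c], [a, b], [d, e]]);
console.log(result.marked.has(c)); // true
console.log(result.marked.has(e)); // false
console.log(result.passes);        // 3: two discovery passes plus one empty pass
```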
Re: Concerns about weak refs and weak maps.
On Oct 29, 2010, at 4:01 PM, felix wrote:

> It's instrumented to count how many times it repeats the Ephemeron scan, so it probably wouldn't be too hard to get practical data on how expensive it is in real usage.

We have a patch in progress for SpiderMonkey, in this bug: https://bugzilla.mozilla.org/show_bug.cgi?id=547941 (it won't be in Firefox 4).

Instrumentation is easy; the hard part (as ever) is workload selection. That would be good to build up, in advance of implementations.

/be
Concerns about weak refs and weak maps.
I have concerns about weak refs and weak maps: one concern related to usability, one related to performance bugs, and another related to changing the complexity of the GC algorithm.

Usability

Weak maps act as a GC API and in doing so violate GC's greatest strength: the user does not have to be concerned with memory management. Since the user can never observe that a key/value pair has been deleted, weak maps are much better than weak refs, but they still seem capable of causing hard-to-diagnose-and-fix performance bugs related to memory pressure and the particular GC implementation.

GC implementation

Software stacks on multicores will need GCs to become increasingly concurrent and latency free. Weak maps cause concurrent GCs to perform additional mutator synchronizations. While a concurrent GC already has to do some synchronization, each additional synchronization point impedes scalability. My concern is that the number of synchronization points might be proportional to the number of (k, v) pairs in the weak maps (see below).

GC Complexity

Another potential performance bug is that the complexity of the Abstract GC Algorithm at http://wiki.ecmascript.org/doku.php?id=harmony:weak_maps appears to be O(n**2), since each loop may only discover a single retained key k from (k, v) whose walk of the related value v discovers only a single unretained k. This would then be repeated and require rescanning all the weak maps in each of the following iterations. Concurrent GCs would probably require one or more mutator synchronizations at each iteration.

My sense is that the value of a simple, efficient, concurrent GC and reduced latency is higher than the value of weak maps, so weak maps should not be part of the standard.

- Rick
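Rick's O(n**2) worst case can be made concrete with a toy measurement (a hypothetical simulation, not engine code): n ephemeron entries form a chain in which each value strongly references the next entry's key, and the table is scanned in an unlucky order so that every full pass discovers exactly one newly retained key. n entries then need n + 1 passes, i.e. O(n^2) entry visits overall.

```javascript
// Count full-table scan passes for a pathological ephemeron chain.
function chainScanPasses(n) {
  const keys = [];
  for (let i = 0; i < n; i++) keys.push({ i });
  // entry i maps keys[i] -> keys[i + 1]; reversed so forward scans
  // discover only one new key per pass.
  const entries = keys
    .map((k, i) => [k, keys[i + 1] || { leaf: true }])
    .reverse();
  const retained = new Set([keys[0]]); // only the first key is rooted

  let passes = 0;
  let progress = true;
  while (progress) {
    progress = false;
    passes += 1;
    for (const [key, value] of entries) {
      if (retained.has(key) && !retained.has(value)) {
        retained.add(value); // retaining the value reaches the next key
        progress = true;
      }
    }
  }
  return passes;
}

console.log(chainScanPasses(10));  // 11 passes over a 10-entry table
console.log(chainScanPasses(100)); // 101
```

Whether real workloads ever approach this shape is exactly the kind of question the instrumentation discussed later in the thread could answer.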
Re: Concerns about weak refs and weak maps.
On Oct 28, 2010, at 2:10 PM, Hudson, Rick wrote:

> I have concerns about weak refs and weak maps: one concern related to usability, one related to performance bugs, and another related to changing the complexity of the GC algorithm.

Hi Rick, your comments are on target, but I'll differ on some fine points below. Thanks for writing.

> Usability
>
> Weak maps act as a GC API and in doing so violate GC's greatest strength: the user does not have to be concerned with memory management. Since the user can never observe that a key/value pair has been deleted, weak maps are much better than weak refs, but they still seem capable of causing hard-to-diagnose-and-fix performance bugs related to memory pressure and the particular GC implementation.

This is all true, but developers are already concerned with memory management. GC is good, I like it, but it is not and never will be perfect (enough). Even V8 is facing criticism in its nodejs.org embedding, for among other things having too-small heap limits. Node users are retracing the JVM users' footprints and asking for an explicit library call, gc(), to force collection. So weak maps can't be the breaking point that throws concern over memory management via GC into users' faces.

It's an issue on the client side too, already (and for many years, mostly due to cycles crossing from JS through the DOM, which is under separate, often ref-counted, management, and back to JS via the global object containing a closure capturing its environment, including document).

Plus, weak maps give JS bridge-builders and compiler/runtime writers a needed tool, without which the only options are too costly (linear search of an array matching object reference identity) and leak-prone (strong references only).

> GC implementation
>
> Software stacks on multicores will need GCs to become increasingly concurrent and latency free. Weak maps cause concurrent GCs to perform additional mutator synchronizations. While a concurrent GC already has to do some synchronization, each additional synchronization point impedes scalability. My concern is that the number of synchronization points might be proportional to the number of (k, v) pairs in the weak maps (see below).

Browsers do not multithread JS; the language has no such execution model. More, it won't get one. Node.js uses a single thread for all requests, requiring continuation-passing-style coding. There are alternatives, along Erlang lines, but they do not involve shared mutable state. So no global, concurrent GC is in the cards, on client or server -- my best projection (and strong advice!).

> GC Complexity
>
> Another potential performance bug is that the complexity of the Abstract GC Algorithm at http://wiki.ecmascript.org/doku.php?id=harmony:weak_maps appears to be O(n**2), since each loop may only discover a single retained key k from (k, v) whose walk of the related value v discovers only a single unretained k. This would then be repeated and require rescanning all the weak maps in each of the following iterations. Concurrent GCs would probably require one or more mutator synchronizations at each iteration.

Ignoring the concurrent GC concern, which I believe does not apply to JS because JS won't have threads and shared mutable state, you're right: weak maps make GC potentially costly. Andreas Gal has an implementation in progress for Firefox and he can comment on this point.

> My sense is that the value of a simple, efficient, concurrent GC and reduced latency is higher than the value of weak maps, so weak maps should not be part of the standard.

If you remove the concurrent GC concern, then what is your sense?

/be
RE: Concerns about weak refs and weak maps.
Hi Brendan,

I think we all agree that EcmaScript, the language, should not contain threads with shared-memory constructs such as locks or transactions. We seem to disagree as to whether EcmaScript, the implementation, should be multithreaded. If bridge builders, compiler writers, and GC implementations need better (non-application-visible) tools, then running their tools concurrently seems the way forward. Web workers that share immutable heap objects behind a message-passing façade could provide reasonable performance without changing the (HTML5) standard. Optimizers running in a separate thread also seem like a good idea.

Moving the scans of weak maps into a stop-the-world GC only makes the GC more intrusive while removing the application writer's ability to optimize their applications. In the GC literature, 50 milliseconds latency is considered the upper bound if one wishes to provide an acceptable client experience; 10 milliseconds is what one wants for a game experience. This is increasingly hard to achieve with a stop-the-world collector as the memory footprint grows while CPU speed does not. In the long run I believe that EcmaScript will not be able to compete with other languages in providing highly responsive client-side applications without a concurrent GC.

But to answer your direct question:

> If you remove the concurrent GC concern, then what is your sense?

GC latency will increase as memory footprint and use of weak maps increase. Large applications using weak maps will have performance bugs that will end up on the GC implementer's desk, where they will be hard to diagnose and fix. Perhaps if folks talk privately to the top half dozen Java GC implementers, EcmaScript might be able to avoid some of their pain.

- Rick
Re: Concerns about weak refs and weak maps.
On Oct 28, 2010, at 5:41 PM, Hudson, Rick wrote:

> Hi Brendan,
>
> I think we all agree that EcmaScript, the language, should not contain threads with shared-memory constructs such as locks or transactions. We seem to disagree as to whether EcmaScript, the implementation, should be multithreaded. If bridge builders, compiler writers, and GC implementations need better (non-application-visible) tools

[hit |Pause| here]

What's an application? It is hard to make a priori judgments such as "non-application-visible" when we're talking about JS user code. It could be part of the app, or in a library, but the point is, it's written in JS. Not C++.

> then running their tools concurrently seems the way forward.

See above -- the bridges, etc. are pure JS, and by the agreement, no threads in JS. Ergo no overt concurrency.

I didn't pick on it, but your closing-paragraph phrase "simple efficient concurrent GC" is as far as I can tell an unsolved research problem. The Go language talks about such a GC, but its implementation lacks one at present, AFAIK. Hans Boehm has worked for a long time on researching some pair-wise combinations of words in that phrase, but all four? Not yet, and not in sight from where I sit.

Efficient, concurrent GC is achievable, but it is far from simple, and if you can avoid it, your runtime is ceteris paribus simpler, smaller, less buggy, and possibly quite concurrent enough for your workloads and hardware parallelism. Shared nothing is a wonderful thing. Concurrent and (in the limit) distributed GC are hairy. Copying is relatively cheap. There will be domain-specific problems where copying is too expensive (too much data), and some lightweight snapshotting system, some particular solution to the view/update problem, with shared immutable data, is needed. But the solutions tend to be quite optimized and specialized, and do not amount to efficient, concurrent, and general-purpose GC. (This is certainly our position at Mozilla Research in building Rust -- http://blog.mozilla.com/graydon/2010/10/02/rust-progress/.)

> Web workers that share immutable heap objects behind a message-passing façade could provide reasonable performance without changing the (HTML5) standard.

HTML5 has structured cloning -- http://www.whatwg.org/specs/web-apps/current-work/multipage/urls.html#structured-clone -- and it is going into Firefox 4. Again, this, like other parts of current browser codebases, is implemented in C++, not JS. Even with a better systems programming language than C++, which JS is not a candidate to become, JS doesn't obviously need efficient, concurrent GC. V8 has a great generational copying GC, but it is single-threaded and that is a significant part of its win.

> Optimizers running in a separate thread also seem like a good idea.

Sure, but these are not going to be implemented in JS in high-performance ways that depend on shared memory, even if immutable. This does not make the case for concurrent GC trumping the need for weak maps in pure-JS bridges, runtimes, virtualization layers, security membranes, etc.

> Moving the scans of weak maps into a stop-the-world GC only makes the GC more intrusive while removing the application writer's ability to optimize their applications.

I think we disagree on the inevitability of concurrent GC that trumps pure-JS weak map requirements.

> In the GC literature, 50 milliseconds latency is considered the upper bound if one wishes to provide an acceptable client experience; 10 milliseconds is what one wants for a game experience. This is increasingly hard to achieve with a stop-the-world collector as the memory footprint grows while CPU speed does not.

The state of the art with JS today is a single-threaded, generational copying collector. We don't even have an existence proof of an efficient concurrent GC (leave aside simple). We don't have more than one WeakMap implementation, and the one we have is not (yet) in a generational GC setting. So it's premature to throw trump cards, and the no-threads-ever-in-JS agreement casts doubt on the need for concurrent GC even under the hood. We're looking at more specialized optimizations to shared-nothing that are strictly simpler.

> In the long run I believe that EcmaScript will not be able to compete with other languages in providing highly responsive client-side applications without a concurrent GC.

I'll take the other side of this bet.

> But to answer your direct question:
>> If you remove the concurrent GC concern, then what is your sense?
> GC latency will increase as memory footprint and use of weak maps increase. Large applications using weak maps will have performance bugs that will end up on the GC implementer's desk, where they will be hard to diagnose and fix.

True, WeakMaps up the ante, but nowhere near to the complexity of efficient, concurrent GC -- which does not seem to be on anyone's drawing boards, yet we have more than a few