Re: Concerns about weak refs and weak maps.

2010-10-29 Thread Erik Corry
I share Rick's worry about weak maps.  It's not clear how to make an
efficient GC in the presence of weak maps.

As Brendan rightly points out there is already a lot of complexity in
the GC of a typical JS implementation because it is interacting with
the (typically) reference counted, C++ based DOM.  Adding complexity
in the form of a new weak reference model looks like it could limit JS
implementations severely down the road.

One of the hallmarks of a real language implementation vs. a 'toy
scripting language' ala PHP is a good GC.  I'd really like to see
someone do a low-latency GC with weak maps before we hobble the
language with something that can't be implemented efficiently.  By
modern I mean generational and either parallel or concurrent.  (In a
parallel GC there are several threads doing GC in parallel while the
actual JS program is stopped.  In a concurrent GC there is one or more
threads doing GC while the JS program is still making progress).

It is true that the language is not multithreaded and never will be.
This does not limit the implementation from making use of several
threads/CPUs when doing garbage collection.  We actually have a V8
prototype that has a parallel GC
http://codereview.chromium.org/3615009/show though probably it won't
be landed in its current form.  We will likely be investigating a
slightly different approach.  But in principle this is quite possible:
 a single threaded JS implementation where part of the runtime is
multithreaded.

Just for clarity, since Brendan brought up the limitations of the V8
GC:  At the moment the V8 GC is accurate (not conservative), moving
(resistent to fragmentation) and generational (most pauses are small).
 But there is plenty of room for improvement in terms of very large
heaps, worst-case pause times and making use of multiple CPUs.

Let's be careful not to add things to the language that will limit
implementations from getting the sorts of modern GC implementations
that Java and .NET have.

-- 
Erik Corry
___
es-discuss mailing list
es-discuss@mozilla.org
https://mail.mozilla.org/listinfo/es-discuss


Re: Concerns about weak refs and weak maps.

2010-10-29 Thread Brendan Eich
On Oct 29, 2010, at 12:08 AM, Erik Corry wrote:

 One of the hallmarks of a real language implementation vs. a 'toy
 scripting language' ala PHP is a good GC.  I'd really like to see
 someone do a low-latency GC with weak maps before we hobble the
 language with something that can't be implemented efficiently.

Any chance you guys will implement the WeakMap proposal?


 By modern I mean generational and either parallel or concurrent.

Parallel vs. concurrent GC is a good distinction to bring up. We're interested 
in parallel GC too, and we have a WeakMap prototype under way, so we'll have to 
report back in due course.

Still, it seems premature to rule WeakMaps out right now. Rather than strangle 
them in the cradle, I hope several browser-based engines can implement and see 
what we find out. The use-cases in the language remain pretty compelling, 
without a good fallback strategy (leaky strong maps? No object reference keyed 
maps at all?).

Concurrent is a different animal from parallel, and I remain skeptical about JS 
implementations  deviating significantly from shared-nothing concurrency, even 
as the JS embeddings use threads under the hood to utilize more cores.

/be

___
es-discuss mailing list
es-discuss@mozilla.org
https://mail.mozilla.org/listinfo/es-discuss


Re: Concerns about weak refs and weak maps.

2010-10-29 Thread Erik Corry
2010/10/29 Brendan Eich bren...@mozilla.com:
 On Oct 29, 2010, at 12:08 AM, Erik Corry wrote:

 One of the hallmarks of a real language implementation vs. a 'toy
 scripting language' ala PHP is a good GC.  I'd really like to see
 someone do a low-latency GC with weak maps before we hobble the
 language with something that can't be implemented efficiently.

 Any chance you guys will implement the WeakMap proposal?

I can't see Google's V8 team doing that, but anyone can take V8 and do
experiments on their branch of it.

 By modern I mean generational and either parallel or concurrent.

 Parallel vs. concurrent GC is a good distinction to bring up. We're 
 interested in parallel GC too, and we have a WeakMap prototype under way, so 
 we'll have to report back in due course.

 Still, it seems premature to rule WeakMaps out right now. Rather than 
 strangle them in the cradle, I hope several browser-based engines can 
 implement and see what we find out. The use-cases in the language remain 
 pretty compelling, without a good fallback strategy (leaky strong maps? No 
 object reference keyed maps at all?).

 Concurrent is a different animal from parallel, and I remain skeptical about 
 JS implementations  deviating significantly from shared-nothing concurrency, 
 even as the JS embeddings use threads under the hood to utilize more cores.

 /be


___
es-discuss mailing list
es-discuss@mozilla.org
https://mail.mozilla.org/listinfo/es-discuss


RE: Concerns about weak refs and weak maps.

2010-10-29 Thread Allen Wirfs-Brock
Rick and Erik,

I share your concerns about the importance of efficient GC for JavaScript.  
However, based upon direct experience, I'm not particularly concerned about the 
impact of Weak maps.  The ES Harmony Weak Maps is derived from the Ephemeron 
concept first developed at Digitalk.  We first implemented them in the 
production GC for Visual Smalltalk and later in ParcPlace-Digitalk (now Cincom) 
VisualWorks Smalltalk.  These were both high performance, low latency, 
accurate, moving,  multi-generational collectors that have now been in 
production use for over 15 years.  In general, Smalltalk application seem to be 
are much more demanding upon a GC then most modern JavaScript applications (and 
correspondingly, my perceptions is that JavaSript GCs have not yet reached the 
sophistication and throughput levels that were typical of Smalltalk GCs).

While the above Smalltalk GCs are multi-generational, neither utilized any 
parallelism.  They clearly demonstrate that parallelism is not necessary to 
achieve low pause times in highly interactive GC'ed systems. Regardless, modern 
collectors exploiting parallelism exist for both Java and .Net that support 
various weak pointer abstractions.  I'm comfortable that they provide a 
sufficient existence proof for efficient parallel collectors supporting weak 
reference semantics.

It is unfortunate, that the use of WeakMaps/Weak pointers/Ephemerons do impose 
some overhead that is proportional to their  use. But the use of weak 
abstractions generally shows up in non-ephemeral use cases so this really isn't 
a problem. Still, they are a feature that shouldn't be used unnecessarily. 
However, when they are necessary, there really isn't any other good alternative.

I agree that current JavaScript GCs have a lot of issues that  derive from 
interactions with C/C++ based DOMs.  I think the long term solution is to 
allocate those objects in the JavaScript GC'ed heap.  My slogan from a recent 
talk (http://bit.ly/ahiGS1 ): Build a great GC and then use it everywhere.

Allen


___
es-discuss mailing list
es-discuss@mozilla.org
https://mail.mozilla.org/listinfo/es-discuss


Re: Concerns about weak refs and weak maps.

2010-10-29 Thread P T Withington
On 2010-10-28, at 17:10, Hudson, Rick wrote:

 GC implementation
 Software stacks on multicores will need GCs to become increasingly concurrent 
 and latency free. Weak maps cause concurrent GCs and additional mutator 
 synchronizations.  While a concurrent GC already has to do some 
 synchronization, each additional synchronization point impedes scalability. 
 My concern is that the number of synchronization points might be proportional 
 to the number of k, v pairs in the weak maps (see below).

This is old information, but perhaps relevant.  Basically handling weak 
references in a standard hardware by emulating what a Lisp Machine would have 
done  [From http://bit.ly/b4xidb]

[...]

AWL has another special power: it enables better handing of barrier hits on 
weak objects. To explain the benefit we need to describe a problem first. The 
MPS uses a read-barrier to perform incremental garbage collection [@@ link to 
canned-for-client explanation of how a garbage collector works in broad terms]. 
When the client tries to read an object containing weak references the MPS may 
have protected it so that the MPS can process the object before the client gets 
to see it. The problem for weak objects is that the client may try and access 
the object at a point in the collection cycle when the MPS cannot yet determine 
the status of the objects that the weak object refers to. What the MPS does in 
this situation is assume that all the referenced objects are going to live. 
This assumption is correct but conservative; it may result in objects that are 
weakly referenced staying alive for longer than they need to. In the worst case 
this can result in a very large amount of memor
 y being used by objects that are no longer needed.

In order to combat this problem the MPS sometimes does the following: Instead 
of processing the entire weak object and unprotecting it so that the client can 
access the object the MPS may emulate the processor instruction. When this 
happens the MPS doesn't process the entire weak object, it only processes the 
exact location that was being accessed (typically a single word), it emulates 
the processor instruction, and it keeps the object protected. This happens 
invisibly from the client's perspective, it's exactly as if the instruction 
executed as normal. The MPS instead of processing the entire object processes 
just a single word.

[...]


___
es-discuss mailing list
es-discuss@mozilla.org
https://mail.mozilla.org/listinfo/es-discuss


Re: Concerns about weak refs and weak maps.

2010-10-29 Thread Brendan Eich
On Oct 29, 2010, at 3:57 AM, Erik Corry wrote:

 2010/10/29 Brendan Eich bren...@mozilla.com:
 On Oct 29, 2010, at 12:08 AM, Erik Corry wrote:
 
 One of the hallmarks of a real language implementation vs. a 'toy
 scripting language' ala PHP is a good GC.  I'd really like to see
 someone do a low-latency GC with weak maps before we hobble the
 language with something that can't be implemented efficiently.

Complex problems don't go away by renouncing any in language solution. The 
complexity moves to the pure-JS level and whatever solution can be hacked 
there, which then has to be downloaded, and which has bad performance or 
memory-leak hazards. This then tends to produce non-standard native 
implementations in various engines. Just noting this, more below.


 Any chance you guys will implement the WeakMap proposal?
 
 I can't see Google's V8 team doing that, but anyone can take V8 and do
 experiments on their branch of it.

Cool, maybe we can find a volunteer. Google has projects using JS to manage 
objects, where I believe in at least one case (Caja), WeakMaps would be a 
significant win. Such use-cases are not unique to Google, they're common in JS 
libraries and virtualization schemes. But my first thought is to scour Google 
for people who need WeakMaps and who could implement an experimental patch for 
V8.

As people have noted in this thread, you can't really make up for the lack of 
WeakMaps in JS currently without falling into O(n^2) complexity traps (arrays 
of objects, linear search). The strong references in such maps make for 
guaranteed memory leaks. If there's another alternative that lacks both GC 
complexity and user-facing leak and performance issues, we are all ears. Doing 
nothing is likely to bid forth extensions in browser JS implementations, which 
may lead to de-facto standardization.

We can keep experimenting with WeakMaps, but if we don't have an understanding 
of why demand for something like them exists and is likely to create a standard 
of some sort, we won't have a fruitful discussion, and we probably won't serve 
JS users well.

Renouncing WeakMaps certainly would keep GC implementations, e.g., in V8, 
simpler, but that's too narrow a point of view for Ecma TC39. And if Apple and 
other vendors do implement some kind of weak map, market pressure may force V8 
to follow. This is not the Harmony way to proceed, since it tends to make 
more non-interoperation and developer pain in the market, and take longer.

Nevertheless, it sounds like we should reconsider the move of WeakMaps into 
harmony status (http://wiki.ecmascript.org/doku.php?id=harmony:weak_maps) 
based on your feedback, moreso than Rick's due to your representing the V8 
team. We can move the proposal back to strawman status, pending more results of 
experiments (plural, so we need help; SpiderMonkey is not enough). Is this your 
intention?

/be
___
es-discuss mailing list
es-discuss@mozilla.org
https://mail.mozilla.org/listinfo/es-discuss


Re: Concerns about weak refs and weak maps.

2010-10-29 Thread Brendan Eich
On Oct 29, 2010, at 12:08 AM, Erik Corry wrote:

 Let's be careful not to add things to the language that will limit
 implementations from getting the sorts of modern GC implementations
 that Java and .NET have.


Yet don't Java and .NET have weak-keyed maps of some sort?

/be
___
es-discuss mailing list
es-discuss@mozilla.org
https://mail.mozilla.org/listinfo/es-discuss


Re: Concerns about weak refs and weak maps.

2010-10-29 Thread felix

On 10/29/10 10:51, Brendan Eich wrote:

On Oct 29, 2010, at 12:08 AM, Erik Corry wrote:


Let's be careful not to add things to the language that will limit
implementations from getting the sorts of modern GC implementations
that Java and .NET have.



Yet don't Java and .NET have weak-keyed maps of some sort?


Java has WeakHashMap, which is a map of weak references to values, which 
can leak uncollectible garbage if you have a cycle of WeakHashMap 
entries holding references to objects that are not otherwise referenced.


___
es-discuss mailing list
es-discuss@mozilla.org
https://mail.mozilla.org/listinfo/es-discuss


RE: Concerns about weak refs and weak maps.

2010-10-29 Thread Allen Wirfs-Brock
 -Original Message-
 From: es-discuss-boun...@mozilla.org [mailto:es-discuss-
 boun...@mozilla.org] On Behalf Of felix
 Sent: Friday, October 29, 2010 12:18 PM
 To: Brendan Eich
...
 
  Yet don't Java and .NET have weak-keyed maps of some sort?
 
 Java has WeakHashMap, which is a map of weak references to values, which
 can leak uncollectible garbage if you have a cycle of WeakHashMap entries
 holding references to objects that are not otherwise referenced.

While is the problem addressed by the proposed ephemeron-like Weak Map solution 
for ECMAScript 
___
es-discuss mailing list
es-discuss@mozilla.org
https://mail.mozilla.org/listinfo/es-discuss


RE: Concerns about weak refs and weak maps.

2010-10-29 Thread Allen Wirfs-Brock
Oops, typo fixed  below

 -Original Message-
 From: es-discuss-boun...@mozilla.org [mailto:es-discuss-
 boun...@mozilla.org] On Behalf Of Allen Wirfs-Brock
 Sent: Friday, October 29, 2010 12:24 PM
 To: felix; Brendan Eich
 Cc: Andreas Gal; es-discuss
 Subject: RE: Concerns about weak refs and weak maps.
 
  -Original Message-
  From: es-discuss-boun...@mozilla.org [mailto:es-discuss-
  boun...@mozilla.org] On Behalf Of felix
  Sent: Friday, October 29, 2010 12:18 PM
  To: Brendan Eich
 ...
  
   Yet don't Java and .NET have weak-keyed maps of some sort?
 
  Java has WeakHashMap, which is a map of weak references to values,
  which can leak uncollectible garbage if you have a cycle of
  WeakHashMap entries holding references to objects that are not otherwise
 referenced.
 
Which is the problem addressed by the proposed ephemeron-like Weak Map solution 
for ECMAScript

___
es-discuss mailing list
es-discuss@mozilla.org
https://mail.mozilla.org/listinfo/es-discuss


RE: Concerns about weak refs and weak maps.

2010-10-29 Thread Jim Deville
C# 4.0 has WeakMap's too, and one of the goals was to work around the cyclical 
reference issue:

http://blogs.msdn.com/b/clrteam/archive/2009/05/18/the-conditional-weak-table-enabling-dynamic-object-properties.aspx



-Original Message-
From: es-discuss-boun...@mozilla.org [mailto:es-discuss-boun...@mozilla.org] On 
Behalf Of Allen Wirfs-Brock
Sent: Friday, October 29, 2010 3:04 PM
To: Allen Wirfs-Brock; felix; Brendan Eich
Cc: Andreas Gal; es-discuss
Subject: RE: Concerns about weak refs and weak maps.

Oops, typo fixed  below

 -Original Message-
 From: es-discuss-boun...@mozilla.org [mailto:es-discuss- 
 boun...@mozilla.org] On Behalf Of Allen Wirfs-Brock
 Sent: Friday, October 29, 2010 12:24 PM
 To: felix; Brendan Eich
 Cc: Andreas Gal; es-discuss
 Subject: RE: Concerns about weak refs and weak maps.
 
  -Original Message-
  From: es-discuss-boun...@mozilla.org [mailto:es-discuss- 
  boun...@mozilla.org] On Behalf Of felix
  Sent: Friday, October 29, 2010 12:18 PM
  To: Brendan Eich
 ...
  
   Yet don't Java and .NET have weak-keyed maps of some sort?
 
  Java has WeakHashMap, which is a map of weak references to values, 
  which can leak uncollectible garbage if you have a cycle of 
  WeakHashMap entries holding references to objects that are not 
  otherwise
 referenced.
 
Which is the problem addressed by the proposed ephemeron-like Weak Map solution 
for ECMAScript

___
es-discuss mailing list
es-discuss@mozilla.org
https://mail.mozilla.org/listinfo/es-discuss

___
es-discuss mailing list
es-discuss@mozilla.org
https://mail.mozilla.org/listinfo/es-discuss


Re: Concerns about weak refs and weak maps.

2010-10-29 Thread felix

On 10/29/10 15:18, Jim Deville wrote:

C# 4.0 has WeakMap's too, and one of the goals was to work around the cyclical 
reference issue:

http://blogs.msdn.com/b/clrteam/archive/2009/05/18/the-conditional-weak-table-enabling-dynamic-object-properties.aspx


mono implements ConditionalWeakTable with Ephemeron tables registered 
with the GC.  The registration does nothing with boehm-gc, but there is 
special handling for Ephemerons in the experimental sgen-gc, and it 
looks to me like it's similar to what's described on the ECMAScript wiki 
for WeakMap: repeatedly scan Ephemeron key,value entries for values that 
are newly reachable, until no more newly-reachable values are found.


http://github.com/mono/mono/blob/09ae7f243df64f6928ffcd89eb44ac1f688c689e/mono/metadata/sgen-gc.c

it's instrumented to count how many times it repeats the Ephemeron scan, 
so it probably wouldn't be too hard to get practical data on how 
expensive it is in real usage.

___
es-discuss mailing list
es-discuss@mozilla.org
https://mail.mozilla.org/listinfo/es-discuss


Re: Concerns about weak refs and weak maps.

2010-10-29 Thread Brendan Eich
On Oct 29, 2010, at 4:01 PM, felix wrote:

 On 10/29/10 15:18, Jim Deville wrote:
 C# 4.0 has WeakMap's too, and one of the goals was to work around the 
 cyclical reference issue:
 
 http://blogs.msdn.com/b/clrteam/archive/2009/05/18/the-conditional-weak-table-enabling-dynamic-object-properties.aspx
 
 mono implements ConditionalWeakTable with Ephemeron tables registered with 
 the GC.  The registration does nothing with boehm-gc, but there is special 
 handling for Ephemerons in the experimental sgen-gc, and it looks to me like 
 it's similar to what's described on the ECMAScript wiki for WeakMap: 
 repeatedly scan Ephemeron key,value entries for values that are newly 
 reachable, until no more newly-reachable values are found.
 
 http://github.com/mono/mono/blob/09ae7f243df64f6928ffcd89eb44ac1f688c689e/mono/metadata/sgen-gc.c
 
 it's instrumented to count how many times it repeats the Ephemeron scan, so 
 it probably wouldn't be too hard to get practical data on how expensive it is 
 in real usage.


We have a patch in progress for SpiderMonkey, in this bug:

https://bugzilla.mozilla.org/show_bug.cgi?id=547941

It won't be in Firefox 4.

Instrumentation is easy, the hard part (as ever) is workload selection. That 
would be good to build up, in advance of implementations.

/be
___
es-discuss mailing list
es-discuss@mozilla.org
https://mail.mozilla.org/listinfo/es-discuss


Concerns about weak refs and weak maps.

2010-10-28 Thread Hudson, Rick
I have concerns about weak refs and weak maps, one concern related to 
usability, one related to performance bugs, and another other related to 
changing the complexity of the GC algorithm.

Usability
Weak maps act as a GC API and in doing so violate GC's greatest strength:  the 
user does not have to be concerned with memory management.  Since the user can 
never observe that a key/value pair has been deleted,  weak maps are much 
better than weak refs but they still seem capable of causing hard to diagnose 
and fix performance bugs related to memory pressure and the particular GC 
implementation.

GC implementation
Software stacks on multicores will need GCs to become increasingly concurrent 
and latency free. Weak maps cause concurrent GCs and additional mutator 
synchronizations.  While a concurrent GC already has to do some 
synchronization, each additional synchronization point impedes scalability. My 
concern is that the number of synchronization points might be proportional to 
the number of k, v pairs in the weak maps (see below).

GC Complexity
Another potential performance bug is that the complexity of the Abstract GC 
Algorithm at http://wiki.ecmascript.org/doku.php?id=harmony:weak_maps appears 
to be O(n**2), since each loop may only discover a single retained key k from 
(k,v) whose walk of the related value v discovers only a single unretained k. 
This would then be repeated and require rescanning all the weak maps in the 
each of the following iterations. Concurrent GCs would probable require one or 
more mutator synchronizations at each iteration.

My sense is that the value of a simple efficient concurrent GC and reduced 
latency is higher than the value of weak maps so weak maps should not be part 
of the standard.


-  Rick



___
es-discuss mailing list
es-discuss@mozilla.org
https://mail.mozilla.org/listinfo/es-discuss


Re: Concerns about weak refs and weak maps.

2010-10-28 Thread Brendan Eich
On Oct 28, 2010, at 2:10 PM, Hudson, Rick wrote:

 I have concerns about weak refs and weak maps, one concern related to 
 usability, one related to performance bugs, and another other related to 
 changing the complexity of the GC algorithm.

Hi Rick, your comments are on target, but I'll differ on some fine points 
below. Thanks for writing.


 Usability
 Weak maps act as a GC API and in doing so violate GC’s greatest strength:  
 the user does not have to be concerned with memory management.  Since the 
 user can never observe that a key/value pair has been deleted,  weak maps are 
 much better than weak refs but they still seem capable of causing hard to 
 diagnose and fix performance bugs related to memory pressure and the 
 particular GC implementation.

This is all true, but developers are already concerned with memory management. 
GC is good, I like it, but it is not and never will be perfect (enough). Even 
V8 is facing criticism in its nodejs.org embedding, for among other things 
having too-small heap limits. Node users are retracing the JVM user footprints 
and asking for an explicit library call, gc(), to force collection.

So weakmaps can't be the breaking point that throws concern over memeory 
management via GC into users' faces. It's an issue on the client side too, 
already (and for many years, mostly due to cycles crossing through the DOM, 
which is under separate, often ref-counted, management, back to JS via the 
global object containing a closure capturing its environment, including 
document).

Plus, weakmaps give JS bridge-builders and compiler/runtime writers a needed 
tool, without which the only options are too costly (linear search of an array 
matching object reference identity) and leak-prone (strong references only).


  GC implementation
 Software stacks on multicores will need GCs to become increasingly concurrent 
 and latency free. Weak maps cause concurrent GCs and additional mutator 
 synchronizations.  While a concurrent GC already has to do some 
 synchronization, each additional synchronization point impedes scalability. 
 My concern is that the number of synchronization points might be proportional 
 to the number of k, v pairs in the weak maps (see below).

Browsers do not multithread JS, the language has no such execution model. More, 
it won't get one. Nodejs uses a single thread for all requests, requiring 
continuation passing style coding. There are alternatives, along Erlang lines, 
but they do not involve shared mutable state. So no global, concurrent GC is in 
the cards, on client or server -- my best projection (and strong advice!).


  GC Complexity
 Another potential performance bug is that the complexity of the Abstract GC 
 Algorithm at http://wiki.ecmascript.org/doku.php?id=harmony:weak_maps appears 
 to be O(n**2), since each loop may only discover a single retained key k from 
 (k,v) whose walk of the related value v discovers only a single unretained k. 
 This would then be repeated and require rescanning all the weak maps in the 
 each of the following iterations. Concurrent GCs would probable require one 
 or more mutator synchronizations at each iteration.

Ignoring the concurrent GC concern, which I believe does not apply to JS 
because JS won't have threads and shared mutable state, you're right. Weak maps 
make GC potentially costly. Andreas Gal has an implementation in progress for 
Firefox and he can comment on this point.


  My sense is that the value of a simple efficient concurrent GC and reduced 
 latency is higher than the value of weak maps so weak maps should not be part 
 of the standard.

If you remove the concurrent GC concern, then what is your sense?

/be

___
es-discuss mailing list
es-discuss@mozilla.org
https://mail.mozilla.org/listinfo/es-discuss


RE: Concerns about weak refs and weak maps.

2010-10-28 Thread Hudson, Rick
Hi Brendan,

I think we all agree that EcmaScript, the language, should not contain threads 
with shared memory constructs such as locks or transactions. We seem to 
disagree as to whether EcmaScript, the implementation, should be multithreaded. 
If bridge builders, compiler writers, GC implementations, need better 
(non-application visible) tools then running their tools concurrently seems the 
way forward. Web workers that share immutable heap object behind a message 
passing façade could provide reasonable performance without changing the 
(HTML5) standard. Optimizers running in a separate thread also seem like a good 
idea.  Moving the scans of weak maps into a stop the world GC only makes the GC 
more intrusive while removing the application writers ability to optimize their 
applications.

In the GC literature 50 milliseconds latency is considered the upper bound if 
one wishes to provide an acceptable client experience.  10 milliseconds is what 
one wants for a game experience. This is increasingly hard to achieve with a 
stop the world collector as the memory foot print grows while CPU speed does 
not.

In the long run I believe that EcmaScript will not be able to compete with 
other languages in providing highly responsive client side applications without 
a concurrent GC.

But to answer your direct question:
 If you remove the concurrent GC concern, then what is your sense?
GC latency will increase as memory footprint and use of weak maps increases. 
Large application using weak maps will have performance bugs that will end up 
on the GC implementer's desk where they will be hard to diagnose and fix.

Perhaps if folks talk privately to the top half dozen Java GC implementers 
EcmaScipt might be able to avoid some of their pain.


-  Rick



From: Brendan Eich [mailto:bren...@mozilla.com]
Sent: Thursday, October 28, 2010 2:39 PM
To: Hudson, Rick
Cc: es-discuss; Andreas Gal
Subject: Re: Concerns about weak refs and weak maps.

On Oct 28, 2010, at 2:10 PM, Hudson, Rick wrote:


I have concerns about weak refs and weak maps, one concern related to 
usability, one related to performance bugs, and another other related to 
changing the complexity of the GC algorithm.

Hi Rick, your comments are on target, but I'll differ on some fine points 
below. Thanks for writing.



Usability
Weak maps act as a GC API and in doing so violate GC's greatest strength:  the 
user does not have to be concerned with memory management.  Since the user can 
never observe that a key/value pair has been deleted,  weak maps are much 
better than weak refs but they still seem capable of causing hard to diagnose 
and fix performance bugs related to memory pressure and the particular GC 
implementation.

This is all true, but developers are already concerned with memory management. 
GC is good, I like it, but it is not and never will be perfect (enough). Even 
V8 is facing criticism in its nodejs.orghttp://nodejs.org embedding, for 
among other things having too-small heap limits. Node users are retracing the 
JVM user footprints and asking for an explicit library call, gc(), to force 
collection.

So weakmaps can't be the breaking point that throws concern over memeory 
management via GC into users' faces. It's an issue on the client side too, 
already (and for many years, mostly due to cycles crossing through the DOM, 
which is under separate, often ref-counted, management, back to JS via the 
global object containing a closure capturing its environment, including 
document).

Plus, weakmaps give JS bridge-builders and compiler/runtime writers a needed 
tool, without which the only options are too costly (linear search of an array 
matching object reference identity) and leak-prone (strong references only).



 GC implementation
Software stacks on multicores will need GCs to become increasingly concurrent 
and latency free. Weak maps cause concurrent GCs and additional mutator 
synchronizations.  While a concurrent GC already has to do some 
synchronization, each additional synchronization point impedes scalability. My 
concern is that the number of synchronization points might be proportional to 
the number of k, v pairs in the weak maps (see below).

Browsers do not multithread JS, the language has no such execution model. More, 
it won't get one. Nodejs uses a single thread for all requests, requiring 
continuation passing style coding. There are alternatives, along Erlang lines, 
but they do not involve shared mutable state. So no global, concurrent GC is in 
the cards, on client or server -- my best projection (and strong advice!).



 GC Complexity
Another potential performance bug is that the complexity of the Abstract GC 
Algorithm at http://wiki.ecmascript.org/doku.php?id=harmony:weak_maps appears 
to be O(n**2), since each loop may only discover a single retained key k from 
(k,v) whose walk of the related value v discovers only a single unretained k. 
This would then be repeated and require rescanning all

Re: Concerns about weak refs and weak maps.

2010-10-28 Thread Brendan Eich
On Oct 28, 2010, at 5:41 PM, Hudson, Rick wrote:

 Hi Brendan,
  
 I think we all agree that EcmaScript, the language, should not contain 
 threads with shared memory constructs such as locks or transactions. We seem 
 to disagree as to whether EcmaScript, the implementation, should be 
 multithreaded. If bridge builders, compiler writers, GC implementations, need 
 better (non-application visible) tools

[hit |Pause| here]

What's an application? It is hard to make a-priori judgments such as 
non-application visible when we're talking about JS user code. It could be 
part of the app, or in a library, but the point is, it's written in JS. Not C++.


 then running their tools concurrently seems the way forward.

See above -- the bridges, etc. are pure JS, and by the agreement, no threads in 
JS. Ergo no overt concurrency.

I didn't pick on it, but your closing-paragraph phrase simple efficient 
concurrent GC is as far as I can tell an unsolved research problem. The Go 
language talks about such a GC, but its implementation lacks one at present, 
AFAIK. Hans Boehm has worked for a long time on researching some pair-wise 
combinations of words in that phrase, but all four? Not yet, and not in sight 
from where I sit.

Efficient, concurrent GC is achievable, but it is far from simple, and if 
you can avoid it, your runtime is ceteris paribus simpler, smaller, less buggy, 
and possibly quite concurrent enough for your workloads and hardware 
parallelism.

Shared nothing is a wonderful thing. Concurrent and (in the limit) distributed 
GC are hairy. Copying is relatively cheap.

There will be domain-specific problems where copying is too expensive (too much 
data), and some lightweight snapshotting system, some particular solution to 
the view/update problem, with shared immutable data, is needed. But the 
solutions tend to be quite optimized and specialized, and do not amount to 
efficient, concurrent, and general-purpose GC.

(This is certainly our position at Mozilla Research in building Rust -- 
http://blog.mozilla.com/graydon/2010/10/02/rust-progress/.)


 Web workers that share immutable heap object behind a message passing façade 
 could provide reasonable performance without changing the (HTML5) standard.

HTML5 has structured cloning -- 
http://www.whatwg.org/specs/web-apps/current-work/multipage/urls.html#structured-clone
 -- and it is going into Firefox 4.

Again, this, like other parts of current browser codebases, is implemented in 
C++, not JS. Even with a better systems programming language than C++, which JS 
is not a candidate to become, JS doesn't obviously need efficient, concurrent 
GC. V8 has a great generational copying GC, but it is single-threaded and that 
is a significant part of its win.


 Optimizers running in a separate thread also seem like a good idea.

Sure, but these are not going to be implemented in JS in high-performance ways 
that depend on shared memory, even if immutable. This does not make the case 
for concurrent GC trumping the need for weak maps in pure-JS-bridges, runtimes, 
virtualization layers, security membranes, etc.


  Moving the scans of weak maps into a stop the world GC only makes the GC 
 more intrusive while removing the application writers ability to optimize 
 their applications.

I think we disagree on the inevitability of concurrent GC that trumps pure-JS 
weak map requirements.


 In the GC literature 50 milliseconds latency is considered the upper bound if 
 one wishes to provide an acceptable client experience.  10 milliseconds is 
 what one wants for a game experience. This is increasingly hard to achieve 
 with a stop the world collector as the memory foot print grows while CPU 
 speed does not.

The state of the art with JS today is a single-threaded, generational copying 
collector. We don't even have an existence proof of an efficient concurrent GC 
(leave aside simple). We don't have more than one WeakMap implementation, and 
the one we have is not (yet) in a generational GC setting.

So it's premature to throw trump cards, and the no-threads-ever-in-JS agreement 
casts doubt on the need for concurrent GC even under the hood. We're looking at 
more specialized optimizations to shared-nothing that are strictly simpler.


 In the long run I believe that EcmaScript will not be able to compete with 
 other languages in providing highly responsive client side applications 
 without a concurrent GC.

I'll take the other side of this bet.


  But to answer your direct question:
  If you remove the concurrent GC concern, then what is your sense?
 GC latency will increase as memory footprint and use of weak maps increases. 
 Large application using weak maps will have performance bugs that will end up 
 on the GC implementer’s desk where they will be hard to diagnose and fix.

True, WeakMaps up the ante, but nowhere near to the complexity of efficient, 
concurrent GC -- which does not seem to be on anyone's drawing boards, yet we 
have more than a few