> On Oct 26, 2017, at 8:44 AM, Steve Fink <[email protected]> wrote:
> 
>> On 10/20/17 10:52 AM, Filip Pizlo wrote:
>> 
>> 
>>> On Oct 20, 2017, at 10:29 AM, Mark Miller <[email protected]> wrote:
>>> 
>>> There is a glaring inconsistency in the criteria we use to evaluate these 
>>> issues. While we are understandably reluctant to admit more non-determinism 
>>> into the platform via weakrefs, we have admitted an astonishingly greater 
>>> degree of non-determinism into the platform via "Shared Array Buffers" 
>>> (SAB), i.e., shared memory multithreading with data races.
>>> 
>>> The scenario we legitimately fear for weakrefs: A developer writes code 
>>> that is not correct according to the weakref specification but happens to 
>>> work on present implementations. Perhaps it relies on something being 
>>> collected that is not guaranteed to be collected.
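>>> 
>>> For concreteness, a minimal sketch of code in that first category, written 
>>> against a WeakRef API of roughly the proposed shape (new WeakRef(obj), with 
>>> deref() returning the target, or undefined once it has been collected); 
>>> watchForClose and releaseHandle are purely illustrative names:
>>> 
>>>   // Poll a WeakRef and treat collection of the document as the signal
>>>   // that it is gone, then release an external handle. This only "works"
>>>   // if the engine actually collects the document reasonably promptly,
>>>   // which nothing in the specification would guarantee.
>>>   function watchForClose(doc, handle) {
>>>     const ref = new WeakRef(doc);
>>>     const timer = setInterval(() => {
>>>       if (ref.deref() === undefined) {   // relies on collection happening
>>>         releaseHandle(handle);           // illustrative cleanup
>>>         clearInterval(timer);
>>>       }
>>>     }, 1000);
>>>   }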
>> 
>> Being collected when it shouldn’t have been?  Like a dangling reference.  
>> The game theory of security exploits forces implementations to keep things 
>> alive longer, not shorter.
>> 
>>> Perhaps it relies on something not being collected that is not guaranteed 
>>> not to be collected. A later correct change to an implementation, or 
>>> another correct implementation, causes that code to break. The game theory 
>>> punishes the correct implementation rather than the incorrect code.
>> 
>> Java had weak refs and multiple different implementations.  My claim, as 
>> someone who implemented lots of weird GC algorithms in Java, is that I don’t 
>> know of a case where different weak ref semantics breaks something.  The 
>> only time that getting it wrong ever had an observably bad effect is when 
>> you break weak refs or finalizers so badly that they never get cleared or 
>> called, and then some resource has an unbounded leak.  This usually results 
>> in not being able to run any benchmarks that have weak references or 
>> finalizers, so you fix those bugs pretty quickly.
>> 
>> Here are the motivations:
>> - Competitive perf motivates GC authors to try to free things as soon as 
>> possible.  Smaller heaps mean more speed.  Some benchmarks won’t run to 
>> completion if you aren’t aggressive enough.
> I don't follow this. My GC optimization work usually pushes in the opposite 
> direction -- scanning less, not more (but hopefully not *collecting* much 
> less). We [spidermonkey] partition the heap in all kinds of ways so we don't 
> have to collect the whole thing all the time. It's partitioned into 
> processes, the processes have thread-local heaps, and the thread-local heaps 
> are partitioned into independently-collectable zones specific to various 
> purposes (in the web browser, they're for tabs, iframes, and some internal 
> things.) It doesn't seem unlikely to have a weakref in a lightly-used zone 
> pointing into a more active zone. So yes, we'd aggressively collect the 
> pointee zone to keep the heap small, but scanning the pointer zones is a 
> waste of time. And we're always looking for more ways to subdivide the heap, 
> given that the overhead of GC is mostly determined by the amount of live 
> stuff you have to scan.
> 
> Generational GC similarly partitions the heap, for the same reason. If 
> nothing is surviving minor GCs, you won't bother doing any of the major GCs 
> that would collect the weakref pointees. I have considered (and I believe 
> other engines have implemented) having more generations, by splitting off 
> very long-lived (or alternatively, observed to be read-only) portions of the 
> tenured heap and not scanning those during most major GCs. (I haven't looked 
> enough to decide whether the extra cost and complexity of the write barriers 
> is worth it for us at this point.)
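> 
> The barrier would be roughly this shape (a sketch only, nothing like the 
> real thing, and regionOf plus the object bookkeeping are made up for 
> illustration):
> 
>   // Record, in a remembered set, any object in the rarely-scanned
>   // "ancient" portion of the tenured heap that is mutated to point at
>   // something outside that portion. Most major GCs can then scan just the
>   // remembered set instead of the whole ancient portion.
>   const ancientRememberedSet = new Set();
> 
>   function writeBarrier(sourceObj, targetObj) {
>     if (regionOf(sourceObj) === "ancient" &&     // regionOf is made up
>         regionOf(targetObj) !== "ancient") {
>       ancientRememberedSet.add(sourceObj);       // rescan just this object
>     }
>   }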
> 
I think you missed my point. I never said that GCs should scan more things. 
> That said, we *could* treat a weakref as a cue to collect the source and 
> destination zones together. Which would mean weakrefs would be something of a 
> "go slow" bit, but it might help contain the damage.
>> - The need to avoid dangling references forces us to keep alive at least 
>> what we need to, and sometimes a bit more.
>> 
>> I guess a program could rely on the weak references actually being strong in 
>> some implementation.  I haven’t heard of Java programs ever doing that.  
>> It’s unlikely to happen because the major implementations will try to clear 
>> weak refs as aggressively as they can to compete on benchmarks.
> GC-related competition on benchmarks gets really silly without anyone even 
> intentionally gaming things. I remember making a minor improvement and seeing 
> a benchmark score absolutely plummet. I tracked it down to the benchmark 
> having a "warmup" phase to each subtest (which is important for eliminating 
> variance that prevents detecting small changes, so it's not a wholly bad 
> thing). The minor change shifted several GCs from the unmeasured warmup phase 
> into the measured phase. At which point I realized that much of our parameter 
> tuning, in particular, had been having the effect of hiding GCs under the 
> covers, not necessarily speeding anything up. If a benchmark score started 
> depending on the promptness of weakref cleanup, then you’re right, we’d 
> probably end up messing up our heap partitioning to satisfy whatever 
> arbitrary object graph it's testing, at a cost to real-world performance.
> 
Using benchmarks effectively is a skill. 
> We have multiple benchmarks and treat in-the-field telemetry as the gold 
> standard, so the worst of the effects get caught eventually. The length of 
> the feedback loop matters, though.
> 
> 