Title: [208897] trunk
Revision
208897
Author
[email protected]
Date
2016-11-18 14:11:51 -0800 (Fri, 18 Nov 2016)

Log Message

Concurrent GC should be able to run splay in debug mode and earley/raytrace in release mode with no perf regression
https://bugs.webkit.org/show_bug.cgi?id=164282

Reviewed by Geoffrey Garen and Oliver Hunt.
        
PerformanceTests:

CDjs is a fun benchmark for stressing concurrent GCs, but to really give the GC a good
workout you need to increase the amount of work that the test does. This adds a second
configuration of the benchmark that has more aircraft. It uses much more memory and causes us
to do more GCs and those GCs take longer.

* JetStream/cdjs/benchmark.js:
(benchmarkImpl):
(benchmark):
* JetStream/cdjs/large.js: Added.

Source/_javascript_Core:

The two three remaining bugs were:

- Improper ordering inside putDirectWithoutTransition() and friends. We need to make sure
  that the GC doesn't see the store to Structure::m_offset until we've resized the butterfly.
  That proved a bit tricky. On the other hand, this means that we could probably remove the
  requirement that the GC holds the Structure lock in some cases. I haven't removed that lock
  yet because I still think it might protect some weird cases, and it doesn't seem to cost us
  anything.
        
- CodeBlock's GC strategy needed to be made thread-safe (visitWeakly, visitChildren, and
  their friends now hold locks) and incremental-safe (we need to update predictions in the
  finalizer to make sure we clear anything that was put into a value profile towards the end
  of GC).
        
- The GC timeslicing scheduler needed to be made a bit more aggressive to deal with
  generational workloads like earley, raytrace, and CDjs. Once I got those benchmarks to run,
  I found that they would do many useless iterations of GC because they wouldn't pause long
  enough after rescanning weak references and roots. I added a bunch of knobs for forcing a
  pause. In the end, I realized that I could get the desired effect by putting a ceiling on
  mutator utilization. We want the GC to finish quickly if it is possible to do so, even if
  the amount of allocation that the mutator had done is low. Having a utilization ceiling
  seems to accomplish this for benchmarks with trivial heaps (earley and raytrace) as well as
  huge heaps (like CDjs in its "large" configuration).
        
This preserves splay performance, makes the concurrent GC more stable, and makes the
concurrent GC not a perf regression on earley or raytrace. It seems to give us great CDjs
performance as well, but this is still hard to tell because we crash a lot in that benchmark.

* bytecode/CodeBlock.cpp:
(JSC::CodeBlock::CodeBlock):
(JSC::CodeBlock::visitWeakly):
(JSC::CodeBlock::visitChildren):
(JSC::CodeBlock::shouldVisitStrongly):
(JSC::CodeBlock::shouldJettisonDueToOldAge):
(JSC::CodeBlock::propagateTransitions):
(JSC::CodeBlock::determineLiveness):
(JSC::CodeBlock::WeakReferenceHarvester::visitWeakReferences):
(JSC::CodeBlock::UnconditionalFinalizer::finalizeUnconditionally):
(JSC::CodeBlock::visitOSRExitTargets):
(JSC::CodeBlock::stronglyVisitStrongReferences):
(JSC::CodeBlock::stronglyVisitWeakReferences):
* bytecode/CodeBlock.h:
(JSC::CodeBlock::clearVisitWeaklyHasBeenCalled):
* heap/CodeBlockSet.cpp:
(JSC::CodeBlockSet::deleteUnmarkedAndUnreferenced):
* heap/Heap.cpp:
(JSC::Heap::ResumeTheWorldScope::ResumeTheWorldScope):
(JSC::Heap::markToFixpoint):
(JSC::Heap::beginMarking):
(JSC::Heap::addToRememberedSet):
(JSC::Heap::collectInThread):
* heap/Heap.h:
* heap/HeapInlines.h:
(JSC::Heap::mutatorFence):
* heap/MarkedBlock.cpp:
* runtime/JSCellInlines.h:
(JSC::JSCell::finishCreation):
* runtime/JSObjectInlines.h:
(JSC::JSObject::putDirectWithoutTransition):
(JSC::JSObject::putDirectInternal):
* runtime/Options.h:
* runtime/Structure.cpp:
(JSC::Structure::add):
* runtime/Structure.h:
* runtime/StructureInlines.h:
(JSC::Structure::add):

Modified Paths

Added Paths

Diff

Modified: trunk/PerformanceTests/ChangeLog (208896 => 208897)


--- trunk/PerformanceTests/ChangeLog	2016-11-18 22:04:43 UTC (rev 208896)
+++ trunk/PerformanceTests/ChangeLog	2016-11-18 22:11:51 UTC (rev 208897)
@@ -1,3 +1,20 @@
+2016-11-18  Filip Pizlo  <[email protected]>
+
+        Concurrent GC should be able to run splay in debug mode and earley/raytrace in release mode with no perf regression
+        https://bugs.webkit.org/show_bug.cgi?id=164282
+
+        Reviewed by Geoffrey Garen and Oliver Hunt.
+        
+        CDjs is a fun benchmark for stressing concurrent GCs, but to really give the GC a good
+        workout you need to increase the amount of work that the test does. This adds a second
+        configuration of the benchmark that has more aircraft. It uses much more memory and causes us
+        to do more GCs and those GCs take longer.
+
+        * JetStream/cdjs/benchmark.js:
+        (benchmarkImpl):
+        (benchmark):
+        * JetStream/cdjs/large.js: Added.
+
 2016-11-14  Filip Pizlo  <[email protected]>
 
         Unreviewed, revert unintended change.

Modified: trunk/PerformanceTests/JetStream/cdjs/benchmark.js (208896 => 208897)


--- trunk/PerformanceTests/JetStream/cdjs/benchmark.js	2016-11-18 22:04:43 UTC (rev 208896)
+++ trunk/PerformanceTests/JetStream/cdjs/benchmark.js	2016-11-18 22:11:51 UTC (rev 208897)
@@ -1,5 +1,5 @@
 // Copyright (c) 2001-2010, Purdue University. All rights reserved.
-// Copyright (C) 2015 Apple Inc. All rights reserved.
+// Copyright (C) 2015-2016 Apple Inc. All rights reserved.
 // 
 // Redistribution and use in source and binary forms, with or without
 // modification, are permitted provided that the following conditions are met:
@@ -23,12 +23,12 @@
 // (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
 // SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 
-function benchmark() {
-    var verbosity = 0;
-    var numAircraft = 1000;
-    var numFrames = 200;
-    var expectedCollisions = 14484;
-    var percentile = 95;
+function benchmarkImpl(configuration) {
+    var verbosity = configuration.verbosity;
+    var numAircraft = configuration.numAircraft;
+    var numFrames = configuration.numFrames;
+    var expectedCollisions = configuration.expectedCollisions;
+    var percentile = configuration.percentile;
 
     var simulator = new Simulator(numAircraft);
     var detector = new CollisionDetector();
@@ -78,3 +78,24 @@
     
     return averageAbovePercentile(times, percentile);
 }
+
+function benchmark() {
+    return benchmarkImpl({
+        verbosity: 0,
+        numAircraft: 1000,
+        numFrames: 200,
+        expectedCollisions: 14484,
+        percentile: 95
+    });
+}
+
+function largeBenchmark() {
+    return benchmarkImpl({
+        verbosity: 0,
+        numAircraft: 20000,
+        numFrames: 100,
+        expectedCollisions: 5827,
+        percentile: 95
+    });
+}
+

Added: trunk/PerformanceTests/JetStream/cdjs/large.js (0 => 208897)


--- trunk/PerformanceTests/JetStream/cdjs/large.js	                        (rev 0)
+++ trunk/PerformanceTests/JetStream/cdjs/large.js	2016-11-18 22:11:51 UTC (rev 208897)
@@ -0,0 +1,45 @@
+// Copyright (c) 2001-2010, Purdue University. All rights reserved.
+// Copyright (C) 2016 Apple Inc. All rights reserved.
+// 
+// Redistribution and use in source and binary forms, with or without
+// modification, are permitted provided that the following conditions are met:
+//  * Redistributions of source code must retain the above copyright
+//    notice, this list of conditions and the following disclaimer.
+//  * Redistributions in binary form must reproduce the above copyright
+//    notice, this list of conditions and the following disclaimer in the
+//    documentation and/or other materials provided with the distribution.
+//  * Neither the name of the Purdue University nor the
+//    names of its contributors may be used to endorse or promote products
+//    derived from this software without specific prior written permission.
+// 
+// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
+// ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+// WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+// DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER BE LIABLE FOR ANY
+// DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+// (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
+// LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
+// ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+// (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
+// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+// This is run as a JSC stress test. Let the harness know that this is a slow test.
+//@ slow!
+
+load("constants.js");
+load("util.js");
+load("red_black_tree.js");
+load("call_sign.js");
+load("vector_2d.js");
+load("vector_3d.js");
+load("motion.js");
+load("reduce_collision_set.js");
+load("simulator.js");
+load("collision.js");
+load("collision_detector.js");
+load("benchmark.js");
+
+var result = largeBenchmark();
+
+print("Average worst case: " + result + " ms.");
+

Modified: trunk/Source/_javascript_Core/ChangeLog (208896 => 208897)


--- trunk/Source/_javascript_Core/ChangeLog	2016-11-18 22:04:43 UTC (rev 208896)
+++ trunk/Source/_javascript_Core/ChangeLog	2016-11-18 22:11:51 UTC (rev 208897)
@@ -1,3 +1,77 @@
+2016-11-18  Filip Pizlo  <[email protected]>
+
+        Concurrent GC should be able to run splay in debug mode and earley/raytrace in release mode with no perf regression
+        https://bugs.webkit.org/show_bug.cgi?id=164282
+
+        Reviewed by Geoffrey Garen and Oliver Hunt.
+        
+        The two three remaining bugs were:
+
+        - Improper ordering inside putDirectWithoutTransition() and friends. We need to make sure
+          that the GC doesn't see the store to Structure::m_offset until we've resized the butterfly.
+          That proved a bit tricky. On the other hand, this means that we could probably remove the
+          requirement that the GC holds the Structure lock in some cases. I haven't removed that lock
+          yet because I still think it might protect some weird cases, and it doesn't seem to cost us
+          anything.
+        
+        - CodeBlock's GC strategy needed to be made thread-safe (visitWeakly, visitChildren, and
+          their friends now hold locks) and incremental-safe (we need to update predictions in the
+          finalizer to make sure we clear anything that was put into a value profile towards the end
+          of GC).
+        
+        - The GC timeslicing scheduler needed to be made a bit more aggressive to deal with
+          generational workloads like earley, raytrace, and CDjs. Once I got those benchmarks to run,
+          I found that they would do many useless iterations of GC because they wouldn't pause long
+          enough after rescanning weak references and roots. I added a bunch of knobs for forcing a
+          pause. In the end, I realized that I could get the desired effect by putting a ceiling on
+          mutator utilization. We want the GC to finish quickly if it is possible to do so, even if
+          the amount of allocation that the mutator had done is low. Having a utilization ceiling
+          seems to accomplish this for benchmarks with trivial heaps (earley and raytrace) as well as
+          huge heaps (like CDjs in its "large" configuration).
+        
+        This preserves splay performance, makes the concurrent GC more stable, and makes the
+        concurrent GC not a perf regression on earley or raytrace. It seems to give us great CDjs
+        performance as well, but this is still hard to tell because we crash a lot in that benchmark.
+
+        * bytecode/CodeBlock.cpp:
+        (JSC::CodeBlock::CodeBlock):
+        (JSC::CodeBlock::visitWeakly):
+        (JSC::CodeBlock::visitChildren):
+        (JSC::CodeBlock::shouldVisitStrongly):
+        (JSC::CodeBlock::shouldJettisonDueToOldAge):
+        (JSC::CodeBlock::propagateTransitions):
+        (JSC::CodeBlock::determineLiveness):
+        (JSC::CodeBlock::WeakReferenceHarvester::visitWeakReferences):
+        (JSC::CodeBlock::UnconditionalFinalizer::finalizeUnconditionally):
+        (JSC::CodeBlock::visitOSRExitTargets):
+        (JSC::CodeBlock::stronglyVisitStrongReferences):
+        (JSC::CodeBlock::stronglyVisitWeakReferences):
+        * bytecode/CodeBlock.h:
+        (JSC::CodeBlock::clearVisitWeaklyHasBeenCalled):
+        * heap/CodeBlockSet.cpp:
+        (JSC::CodeBlockSet::deleteUnmarkedAndUnreferenced):
+        * heap/Heap.cpp:
+        (JSC::Heap::ResumeTheWorldScope::ResumeTheWorldScope):
+        (JSC::Heap::markToFixpoint):
+        (JSC::Heap::beginMarking):
+        (JSC::Heap::addToRememberedSet):
+        (JSC::Heap::collectInThread):
+        * heap/Heap.h:
+        * heap/HeapInlines.h:
+        (JSC::Heap::mutatorFence):
+        * heap/MarkedBlock.cpp:
+        * runtime/JSCellInlines.h:
+        (JSC::JSCell::finishCreation):
+        * runtime/JSObjectInlines.h:
+        (JSC::JSObject::putDirectWithoutTransition):
+        (JSC::JSObject::putDirectInternal):
+        * runtime/Options.h:
+        * runtime/Structure.cpp:
+        (JSC::Structure::add):
+        * runtime/Structure.h:
+        * runtime/StructureInlines.h:
+        (JSC::Structure::add):
+
 2016-11-18  Joseph Pecoraro  <[email protected]>
 
         Web Inspector: Generator functions should have a displayable name when shown in stack traces

Modified: trunk/Source/_javascript_Core/bytecode/CodeBlock.cpp (208896 => 208897)


--- trunk/Source/_javascript_Core/bytecode/CodeBlock.cpp	2016-11-18 22:04:43 UTC (rev 208896)
+++ trunk/Source/_javascript_Core/bytecode/CodeBlock.cpp	2016-11-18 22:11:51 UTC (rev 208897)
@@ -1873,7 +1873,7 @@
     , m_reoptimizationRetryCounter(0)
     , m_creationTime(std::chrono::steady_clock::now())
 {
-    m_visitWeaklyHasBeenCalled.store(false, std::memory_order_relaxed);
+    m_visitWeaklyHasBeenCalled = false;
 
     ASSERT(heap()->isDeferred());
     ASSERT(m_scopeRegister.isLocal());
@@ -1932,7 +1932,7 @@
     , m_reoptimizationRetryCounter(0)
     , m_creationTime(std::chrono::steady_clock::now())
 {
-    m_visitWeaklyHasBeenCalled.store(false, std::memory_order_relaxed);
+    m_visitWeaklyHasBeenCalled = false;
 
     ASSERT(heap()->isDeferred());
     ASSERT(m_scopeRegister.isLocal());
@@ -2508,18 +2508,20 @@
 
 void CodeBlock::visitWeakly(SlotVisitor& visitor)
 {
-    bool setByMe = !m_visitWeaklyHasBeenCalled.compareExchangeStrong(false, true);
-    if (!setByMe)
+    ConcurrentJSLocker locker(m_lock);
+    if (m_visitWeaklyHasBeenCalled)
         return;
+    
+    m_visitWeaklyHasBeenCalled = true;
 
     if (Heap::isMarkedConcurrently(this))
         return;
 
-    if (shouldVisitStrongly()) {
+    if (shouldVisitStrongly(locker)) {
         visitor.appendUnbarrieredReadOnlyPointer(this);
         return;
     }
-
+    
     // There are two things that may use unconditional finalizers: inline cache clearing
     // and jettisoning. The probability of us wanting to do at least one of those things
     // is probably quite close to 1. So we add one no matter what and when it runs, it
@@ -2549,10 +2551,10 @@
     // decision by calling harvestWeakReferences().
 
     m_allTransitionsHaveBeenMarked = false;
-    propagateTransitions(visitor);
+    propagateTransitions(locker, visitor);
 
     m_jitCode->dfgCommon()->livenessHasBeenProved = false;
-    determineLiveness(visitor);
+    determineLiveness(locker, visitor);
 #endif // ENABLE(DFG_JIT)
 }
 
@@ -2575,6 +2577,7 @@
 
 void CodeBlock::visitChildren(SlotVisitor& visitor)
 {
+    ConcurrentJSLocker locker(m_lock);
     // There are two things that may use unconditional finalizers: inline cache clearing
     // and jettisoning. The probability of us wanting to do at least one of those things
     // is probably quite close to 1. So we add one no matter what and when it runs, it
@@ -2592,19 +2595,19 @@
         visitor.reportExtraMemoryVisited(m_instructions.size() * sizeof(Instruction) / refCount);
     }
 
-    stronglyVisitStrongReferences(visitor);
-    stronglyVisitWeakReferences(visitor);
+    stronglyVisitStrongReferences(locker, visitor);
+    stronglyVisitWeakReferences(locker, visitor);
 
     m_allTransitionsHaveBeenMarked = false;
-    propagateTransitions(visitor);
+    propagateTransitions(locker, visitor);
 }
 
-bool CodeBlock::shouldVisitStrongly()
+bool CodeBlock::shouldVisitStrongly(const ConcurrentJSLocker& locker)
 {
     if (Options::forceCodeBlockLiveness())
         return true;
 
-    if (shouldJettisonDueToOldAge())
+    if (shouldJettisonDueToOldAge(locker))
         return false;
 
     // Interpreter and Baseline JIT CodeBlocks don't need to be jettisoned when
@@ -2656,7 +2659,7 @@
     }
 }
 
-bool CodeBlock::shouldJettisonDueToOldAge()
+bool CodeBlock::shouldJettisonDueToOldAge(const ConcurrentJSLocker&)
 {
     if (Heap::isMarkedConcurrently(this))
         return false;
@@ -2683,7 +2686,7 @@
 }
 #endif // ENABLE(DFG_JIT)
 
-void CodeBlock::propagateTransitions(SlotVisitor& visitor)
+void CodeBlock::propagateTransitions(const ConcurrentJSLocker&, SlotVisitor& visitor)
 {
     UNUSED_PARAM(visitor);
 
@@ -2764,7 +2767,7 @@
         m_allTransitionsHaveBeenMarked = true;
 }
 
-void CodeBlock::determineLiveness(SlotVisitor& visitor)
+void CodeBlock::determineLiveness(const ConcurrentJSLocker&, SlotVisitor& visitor)
 {
     UNUSED_PARAM(visitor);
     
@@ -2810,9 +2813,9 @@
     CodeBlock* codeBlock =
         bitwise_cast<CodeBlock*>(
             bitwise_cast<char*>(this) - OBJECT_OFFSETOF(CodeBlock, m_weakReferenceHarvester));
-
-    codeBlock->propagateTransitions(visitor);
-    codeBlock->determineLiveness(visitor);
+    
+    codeBlock->propagateTransitions(NoLockingNecessary, visitor);
+    codeBlock->determineLiveness(NoLockingNecessary, visitor);
 }
 
 void CodeBlock::clearLLIntGetByIdCache(Instruction* instruction)
@@ -2951,7 +2954,9 @@
 {
     CodeBlock* codeBlock = bitwise_cast<CodeBlock*>(
         bitwise_cast<char*>(this) - OBJECT_OFFSETOF(CodeBlock, m_unconditionalFinalizer));
-
+    
+    codeBlock->updateAllPredictions();
+    
 #if ENABLE(DFG_JIT)
     if (codeBlock->shouldJettisonDueToWeakReference()) {
         codeBlock->jettison(Profiler::JettisonDueToWeakReference);
@@ -2959,7 +2964,7 @@
     }
 #endif // ENABLE(DFG_JIT)
 
-    if (codeBlock->shouldJettisonDueToOldAge()) {
+    if (codeBlock->shouldJettisonDueToOldAge(NoLockingNecessary)) {
         codeBlock->jettison(Profiler::JettisonDueToOldAge);
         return;
     }
@@ -3099,7 +3104,7 @@
 }
 #endif
 
-void CodeBlock::visitOSRExitTargets(SlotVisitor& visitor)
+void CodeBlock::visitOSRExitTargets(const ConcurrentJSLocker&, SlotVisitor& visitor)
 {
     // We strongly visit OSR exits targets because we don't want to deal with
     // the complexity of generating an exit target CodeBlock on demand and
@@ -3119,7 +3124,7 @@
 #endif
 }
 
-void CodeBlock::stronglyVisitStrongReferences(SlotVisitor& visitor)
+void CodeBlock::stronglyVisitStrongReferences(const ConcurrentJSLocker& locker, SlotVisitor& visitor)
 {
     visitor.append(&m_globalObject);
     visitor.append(&m_ownerExecutable);
@@ -3141,13 +3146,11 @@
 
 #if ENABLE(DFG_JIT)
     if (JITCode::isOptimizingJIT(jitType()))
-        visitOSRExitTargets(visitor);
+        visitOSRExitTargets(locker, visitor);
 #endif
-
-    updateAllPredictions();
 }
 
-void CodeBlock::stronglyVisitWeakReferences(SlotVisitor& visitor)
+void CodeBlock::stronglyVisitWeakReferences(const ConcurrentJSLocker&, SlotVisitor& visitor)
 {
     UNUSED_PARAM(visitor);
 

Modified: trunk/Source/_javascript_Core/bytecode/CodeBlock.h (208896 => 208897)


--- trunk/Source/_javascript_Core/bytecode/CodeBlock.h	2016-11-18 22:04:43 UTC (rev 208896)
+++ trunk/Source/_javascript_Core/bytecode/CodeBlock.h	2016-11-18 22:11:51 UTC (rev 208897)
@@ -836,7 +836,7 @@
     // concurrent compilation threads finish what they're doing.
     mutable ConcurrentJSLock m_lock;
 
-    Atomic<bool> m_visitWeaklyHasBeenCalled;
+    bool m_visitWeaklyHasBeenCalled;
 
     bool m_shouldAlwaysBeInlined; // Not a bitfield because the JIT wants to store to it.
 
@@ -947,16 +947,16 @@
     void dumpRareCaseProfile(PrintStream&, const char* name, RareCaseProfile*, bool& hasPrintedProfiling);
     void dumpArithProfile(PrintStream&, ArithProfile*, bool& hasPrintedProfiling);
 
-    bool shouldVisitStrongly();
+    bool shouldVisitStrongly(const ConcurrentJSLocker&);
     bool shouldJettisonDueToWeakReference();
-    bool shouldJettisonDueToOldAge();
+    bool shouldJettisonDueToOldAge(const ConcurrentJSLocker&);
     
-    void propagateTransitions(SlotVisitor&);
-    void determineLiveness(SlotVisitor&);
+    void propagateTransitions(const ConcurrentJSLocker&, SlotVisitor&);
+    void determineLiveness(const ConcurrentJSLocker&, SlotVisitor&);
         
-    void stronglyVisitStrongReferences(SlotVisitor&);
-    void stronglyVisitWeakReferences(SlotVisitor&);
-    void visitOSRExitTargets(SlotVisitor&);
+    void stronglyVisitStrongReferences(const ConcurrentJSLocker&, SlotVisitor&);
+    void stronglyVisitWeakReferences(const ConcurrentJSLocker&, SlotVisitor&);
+    void visitOSRExitTargets(const ConcurrentJSLocker&, SlotVisitor&);
 
     std::chrono::milliseconds timeSinceCreation()
     {
@@ -1079,7 +1079,7 @@
 
 inline void CodeBlock::clearVisitWeaklyHasBeenCalled()
 {
-    m_visitWeaklyHasBeenCalled.store(false, std::memory_order_relaxed);
+    m_visitWeaklyHasBeenCalled = false;
 }
 
 template <typename ExecutableType>

Modified: trunk/Source/_javascript_Core/heap/CodeBlockSet.cpp (208896 => 208897)


--- trunk/Source/_javascript_Core/heap/CodeBlockSet.cpp	2016-11-18 22:04:43 UTC (rev 208896)
+++ trunk/Source/_javascript_Core/heap/CodeBlockSet.cpp	2016-11-18 22:11:51 UTC (rev 208897)
@@ -94,8 +94,7 @@
     }
 
     // Any remaining young CodeBlocks are live and need to be promoted to the set of old CodeBlocks.
-    if (scope == CollectionScope::Eden)
-        promoteYoungCodeBlocks(locker);
+    promoteYoungCodeBlocks(locker);
 }
 
 bool CodeBlockSet::contains(const LockHolder&, void* candidateCodeBlock)

Modified: trunk/Source/_javascript_Core/heap/Heap.cpp (208896 => 208897)


--- trunk/Source/_javascript_Core/heap/Heap.cpp	2016-11-18 22:04:43 UTC (rev 208896)
+++ trunk/Source/_javascript_Core/heap/Heap.cpp	2016-11-18 22:11:51 UTC (rev 208897)
@@ -98,7 +98,7 @@
         
         if (Options::logGC()) {
             double thisPauseMS = (MonotonicTime::now() - m_heap.m_stopTime).milliseconds();
-            dataLog(thisPauseMS, " ms (max ", maxPauseMS(thisPauseMS), ")...]\n");
+            dataLog("p=", thisPauseMS, " ms (max ", maxPauseMS(thisPauseMS), ")...]\n");
         }
         
         m_heap.resumeTheWorld();
@@ -542,21 +542,42 @@
     const Seconds period = Seconds::fromMilliseconds(Options::concurrentGCPeriodMS());
     
     const double bytesAllocatedThisCycleAtTheBeginning = m_bytesAllocatedThisCycle;
-    const double bytesAllocatedThisCycleAtTheEnd = bytesAllocatedThisCycleAtTheBeginning * Options::concurrentGCHeadroomRatio();
+    const double bytesAllocatedThisCycleAtTheEnd =
+        Options::concurrentGCMaxHeadroom() *
+        std::max(
+            bytesAllocatedThisCycleAtTheBeginning,
+            static_cast<double>(m_maxEdenSize));
     
-    auto targetCollectorUtilization = [&] () -> double {
+    auto targetMutatorUtilization = [&] () -> double {
         double headroomFullness =
             (m_bytesAllocatedThisCycle - bytesAllocatedThisCycleAtTheBeginning) /
             (bytesAllocatedThisCycleAtTheEnd - bytesAllocatedThisCycleAtTheBeginning);
         
+        // headroomFullness can be NaN and other interesting things if
+        // bytesAllocatedThisCycleAtTheBeginning is zero. We see that in debug tests. This code
+        // defends against all floating point dragons.
+        
         if (!(headroomFullness >= 0))
             headroomFullness = 0;
         if (!(headroomFullness <= 1))
             headroomFullness = 1;
         
-        return headroomFullness;
+        double mutatorUtilization = 1 - headroomFullness;
+        
+        // Scale the mutator utilization into the permitted window.
+        mutatorUtilization =
+            Options::minimumMutatorUtilization() +
+            mutatorUtilization * (
+                Options::maximumMutatorUtilization() -
+                Options::minimumMutatorUtilization());
+        
+        return mutatorUtilization;
     };
     
+    auto targetCollectorUtilization = [&] () -> double {
+        return 1 - targetMutatorUtilization();
+    };
+    
     auto elapsedInPeriod = [&] (MonotonicTime now) -> Seconds {
         return (now - initialTime) % period;
     };
@@ -566,22 +587,41 @@
     };
     
     auto shouldBeResumed = [&] (MonotonicTime now) -> bool {
+        if (Options::collectorShouldResumeFirst())
+            return phase(now) <= targetMutatorUtilization();
         return phase(now) > targetCollectorUtilization();
     };
     
     auto timeToResume = [&] (MonotonicTime now) -> MonotonicTime {
         ASSERT(!shouldBeResumed(now));
+        if (Options::collectorShouldResumeFirst())
+            return now - elapsedInPeriod(now) + period;
         return now - elapsedInPeriod(now) + period * targetCollectorUtilization();
     };
     
     auto timeToStop = [&] (MonotonicTime now) -> MonotonicTime {
         ASSERT(shouldBeResumed(now));
-        return now - elapsedInPeriod(now) + period;
+        if (Options::collectorShouldResumeFirst())
+            return now - elapsedInPeriod(now) + period * targetMutatorUtilization();
+        return now -  - elapsedInPeriod(now) + period;
     };
     
+    // Adjust the target extra pause ratio as necessary.
+    double rateOfCollection =
+        (m_lastGCEndTime - m_lastGCStartTime) /
+        (m_currentGCStartTime - m_lastGCStartTime);
+    
+    if (Options::logGC())
+        dataLog("cr=", rateOfCollection, " ");
+    
+    // FIXME: Determine if this is useful or get rid of it.
+    // https://bugs.webkit.org/show_bug.cgi?id=164940
+    double extraPauseRatio = Options::initialExtraPauseRatio();
+    
     for (unsigned iteration = 1; ; ++iteration) {
         if (Options::logGC())
             dataLog("i#", iteration, " ");
+        MonotonicTime topOfLoop = MonotonicTime::now();
         {
             TimingScope preConvergenceTimingScope(*this, "Heap::markToFixpoint conservative scan");
             ConservativeRoots conservativeRoots(*this);
@@ -642,6 +682,9 @@
         DFG::markCodeBlocks(*m_vm, *m_collectorSlotVisitor);
         bool shouldTerminate = m_collectorSlotVisitor->isEmpty() && m_mutatorMarkStack->isEmpty();
         
+        if (Options::logGC())
+            dataLog(m_collectorSlotVisitor->collectorMarkStack().size(), "+", m_mutatorMarkStack->size() + m_collectorSlotVisitor->mutatorMarkStack().size(), ", a=", m_bytesAllocatedThisCycle / 1024, " kb, b=", m_barriersExecuted, ", mu=", targetMutatorUtilization(), " ");
+        
         // We want to do this to conservatively ensure that we rescan any code blocks that are
         // running right now. However, we need to be sure to do it *after* we mark the code block
         // so that we know for sure if it really needs a barrier. Also, this has to happen after the
@@ -661,23 +704,30 @@
         if (Options::logGC() == GCLogging::Verbose)
             dataLog("Live Weak Handles:\n", *m_collectorSlotVisitor);
         
-        if (Options::logGC())
-            dataLog(m_collectorSlotVisitor->collectorMarkStack().size(), "+", m_collectorSlotVisitor->mutatorMarkStack().size(), " mu=", 1 - targetCollectorUtilization(), " ");
+        MonotonicTime beforeConvergence = MonotonicTime::now();
         
         {
             TimingScope traceTimingScope(*this, "Heap::markToFixpoint tracing");
             ParallelModeEnabler enabler(*m_collectorSlotVisitor);
+            
             if (Options::useCollectorTimeslicing()) {
-                for (;;) {
+                // Before we yield to the mutator, we should do GC work proportional to the time we
+                // spent paused. We initialize the timeslicer to start after this "mandatory" pause
+                // completes.
+                
+                SlotVisitor::SharedDrainResult drainResult;
+                
+                Seconds extraPause = (beforeConvergence - topOfLoop) * extraPauseRatio;
+                initialTime = beforeConvergence + extraPause;
+                drainResult = m_collectorSlotVisitor->drainInParallel(initialTime);
+                
+                while (drainResult != SlotVisitor::SharedDrainResult::Done) {
                     MonotonicTime now = MonotonicTime::now();
-                    SlotVisitor::SharedDrainResult drainResult;
                     if (shouldBeResumed(now)) {
                         ResumeTheWorldScope resumeTheWorldScope(*this);
                         drainResult = m_collectorSlotVisitor->drainInParallel(timeToStop(now));
                     } else
                         drainResult = m_collectorSlotVisitor->drainInParallel(timeToResume(now));
-                    if (drainResult == SlotVisitor::SharedDrainResult::Done)
-                        break;
                 }
             } else {
                 // Disabling collector timeslicing is meant to be used together with
@@ -686,6 +736,8 @@
                 m_collectorSlotVisitor->drainInParallel();
             }
         }
+        
+        extraPauseRatio *= Options::extraPauseRatioIterationGrowthRate();
     }
 
     {
@@ -730,6 +782,7 @@
     m_objectSpace.beginMarking();
     m_mutatorShouldBeFenced = true;
     m_barrierThreshold = tautologicalThreshold;
+    m_barriersExecuted = 0;
 }
 
 void Heap::visitConservativeRoots(ConservativeRoots& roots)
@@ -986,6 +1039,7 @@
 {
     ASSERT(cell);
     ASSERT(!Options::useConcurrentJIT() || !isCompilationThread());
+    m_barriersExecuted++;
     if (!Heap::isMarkedConcurrently(cell)) {
         // During a full collection a store into an unmarked object that had surivived past
         // collections will manifest as a store to an unmarked black object. If the object gets
@@ -1082,6 +1136,8 @@
 
 void Heap::collectInThread()
 {
+    m_currentGCStartTime = MonotonicTime::now();
+    
     Optional<CollectionScope> scope;
     {
         LockHolder locker(*m_threadLock);
@@ -1166,7 +1222,7 @@
     if (Options::logGC()) {
         MonotonicTime after = MonotonicTime::now();
         double thisPauseMS = (after - m_stopTime).milliseconds();
-        dataLog(thisPauseMS, " ms (max ", maxPauseMS(thisPauseMS), "), cycle ", (after - before).milliseconds(), " ms END]\n");
+        dataLog("p=", thisPauseMS, " ms (max ", maxPauseMS(thisPauseMS), "), cycle ", (after - before).milliseconds(), " ms END]\n");
     }
     
     {
@@ -1179,6 +1235,9 @@
 
     setNeedFinalize();
     resumeTheWorld();
+    
+    m_lastGCStartTime = m_currentGCStartTime;
+    m_lastGCEndTime = MonotonicTime::now();
 }
 
 void Heap::stopTheWorld()
@@ -1698,7 +1757,7 @@
     m_bytesAllocatedThisCycle = 0;
 
     if (Options::logGC())
-        dataLog(currentHeapSize / 1024, " kb, ");
+        dataLog("=> ", currentHeapSize / 1024, " kb, ");
 }
 
 void Heap::didFinishCollection(double gcStartTime)

Modified: trunk/Source/_javascript_Core/heap/Heap.h (208896 => 208897)


--- trunk/Source/_javascript_Core/heap/Heap.h	2016-11-18 22:04:43 UTC (rev 208896)
+++ trunk/Source/_javascript_Core/heap/Heap.h	2016-11-18 22:11:51 UTC (rev 208897)
@@ -116,6 +116,8 @@
     
     void writeBarrierWithoutFence(const JSCell* from);
     
+    void mutatorFence();
+    
     // Take this if you know that from->cellState() < barrierThreshold.
     JS_EXPORT_PRIVATE void writeBarrierSlowPath(const JSCell* from);
 
@@ -605,6 +607,12 @@
     Box<Lock> m_threadLock;
     RefPtr<AutomaticThreadCondition> m_threadCondition; // The mutator must not wait on this. It would cause a deadlock.
     RefPtr<AutomaticThread> m_thread;
+    
+    MonotonicTime m_lastGCStartTime;
+    MonotonicTime m_lastGCEndTime;
+    MonotonicTime m_currentGCStartTime;
+    
+    uintptr_t m_barriersExecuted { 0 };
 };
 
 } // namespace JSC

Modified: trunk/Source/_javascript_Core/heap/HeapInlines.h (208896 => 208897)


--- trunk/Source/_javascript_Core/heap/HeapInlines.h	2016-11-18 22:04:43 UTC (rev 208896)
+++ trunk/Source/_javascript_Core/heap/HeapInlines.h	2016-11-18 22:11:51 UTC (rev 208897)
@@ -157,6 +157,12 @@
         addToRememberedSet(from);
 }
 
+inline void Heap::mutatorFence()
+{
+    if (isX86() || UNLIKELY(mutatorShouldBeFenced()))
+        WTF::storeStoreFence();
+}
+
 template<typename Functor> inline void Heap::forEachCodeBlock(const Functor& func)
 {
     forEachCodeBlockImpl(scopedLambdaRef<bool(CodeBlock*)>(func));

Modified: trunk/Source/_javascript_Core/heap/MarkedBlock.cpp (208896 => 208897)


--- trunk/Source/_javascript_Core/heap/MarkedBlock.cpp	2016-11-18 22:04:43 UTC (rev 208896)
+++ trunk/Source/_javascript_Core/heap/MarkedBlock.cpp	2016-11-18 22:11:51 UTC (rev 208897)
@@ -34,6 +34,8 @@
 
 namespace JSC {
 
+const size_t MarkedBlock::blockSize;
+
 static const bool computeBalance = false;
 static size_t balance;
 

Modified: trunk/Source/_javascript_Core/runtime/JSCellInlines.h (208896 => 208897)


--- trunk/Source/_javascript_Core/runtime/JSCellInlines.h	2016-11-18 22:04:43 UTC (rev 208896)
+++ trunk/Source/_javascript_Core/runtime/JSCellInlines.h	2016-11-18 22:11:51 UTC (rev 208897)
@@ -60,8 +60,7 @@
 {
     // This object is ready to be escaped so the concurrent GC may see it at any time. We have
     // to make sure that none of our stores sink below here.
-    if (isX86() || UNLIKELY(vm.heap.mutatorShouldBeFenced()))
-        WTF::storeStoreFence();
+    vm.heap.mutatorFence();
 #if ENABLE(GC_VALIDATION)
     ASSERT(vm.isInitializingObject());
     vm.setInitializingObjectClass(0);

Modified: trunk/Source/_javascript_Core/runtime/JSObjectInlines.h (208896 => 208897)


--- trunk/Source/_javascript_Core/runtime/JSObjectInlines.h	2016-11-18 22:04:43 UTC (rev 208896)
+++ trunk/Source/_javascript_Core/runtime/JSObjectInlines.h	2016-11-18 22:11:51 UTC (rev 208897)
@@ -167,9 +167,9 @@
     unsigned oldOutOfLineCapacity = structure->outOfLineCapacity();
     structure->addPropertyWithoutTransition(
         vm, propertyName, attributes,
-        [&] (const GCSafeConcurrentJSLocker&, PropertyOffset offset) {
-            if (structure->outOfLineCapacity() != oldOutOfLineCapacity) {
-                butterfly = allocateMoreOutOfLineStorage(vm, oldOutOfLineCapacity, structure->outOfLineCapacity());
+        [&] (const GCSafeConcurrentJSLocker&, PropertyOffset offset, unsigned newOutOfLineCapacity) {
+            if (newOutOfLineCapacity != oldOutOfLineCapacity) {
+                butterfly = allocateMoreOutOfLineStorage(vm, oldOutOfLineCapacity, newOutOfLineCapacity);
                 WTF::storeStoreFence();
                 setButterfly(vm, butterfly);
             }
@@ -266,15 +266,14 @@
         unsigned oldOutOfLineCapacity = structure->outOfLineCapacity();
         offset = structure->addPropertyWithoutTransition(
             vm, propertyName, attributes,
-            [&] (const GCSafeConcurrentJSLocker&, PropertyOffset offset) {
+            [&] (const GCSafeConcurrentJSLocker&, PropertyOffset offset, unsigned newOutOfLineCapacity) {
                 Butterfly* butterfly = this->butterfly();
-                if (structure->outOfLineCapacity() != oldOutOfLineCapacity) {
-                    butterfly = allocateMoreOutOfLineStorage(vm, oldOutOfLineCapacity, structure->outOfLineCapacity());
+                if (newOutOfLineCapacity != oldOutOfLineCapacity) {
+                    butterfly = allocateMoreOutOfLineStorage(vm, oldOutOfLineCapacity, newOutOfLineCapacity);
                     WTF::storeStoreFence();
                     setButterfly(vm, butterfly);
                 }
                 validateOffset(offset);
-                ASSERT(structure->isValidOffset(offset));
                 putDirect(vm, offset, value);
             });
 

Modified: trunk/Source/_javascript_Core/runtime/Options.h (208896 => 208897)


--- trunk/Source/_javascript_Core/runtime/Options.h	2016-11-18 22:04:43 UTC (rev 208896)
+++ trunk/Source/_javascript_Core/runtime/Options.h	2016-11-18 22:11:51 UTC (rev 208897)
@@ -195,8 +195,13 @@
     v(double, mediumHeapGrowthFactor, 1.5, Normal, nullptr) \
     v(double, largeHeapGrowthFactor, 1.24, Normal, nullptr) \
     v(bool, useCollectorTimeslicing, true, Normal, nullptr) \
-    v(double, concurrentGCHeadroomRatio, 1.5, Normal, nullptr) \
+    v(double, minimumMutatorUtilization, 0, Normal, nullptr) \
+    v(double, maximumMutatorUtilization, 0.7, Normal, nullptr) \
+    v(double, concurrentGCMaxHeadroom, 1.5, Normal, nullptr) \
     v(double, concurrentGCPeriodMS, 2, Normal, nullptr) \
+    v(double, initialExtraPauseRatio, 0, Normal, nullptr) \
+    v(double, extraPauseRatioIterationGrowthRate, 1.1, Normal, nullptr) \
+    v(bool, collectorShouldResumeFirst, false, Normal, nullptr) \
     v(bool, scribbleFreeCells, false, Normal, nullptr) \
     v(double, sizeClassProgression, 1.4, Normal, nullptr) \
     v(unsigned, largeAllocationCutoff, 100000, Normal, nullptr) \

Modified: trunk/Source/_javascript_Core/runtime/Structure.cpp (208896 => 208897)


--- trunk/Source/_javascript_Core/runtime/Structure.cpp	2016-11-18 22:04:43 UTC (rev 208896)
+++ trunk/Source/_javascript_Core/runtime/Structure.cpp	2016-11-18 22:11:51 UTC (rev 208897)
@@ -943,7 +943,7 @@
 
 PropertyOffset Structure::add(VM& vm, PropertyName propertyName, unsigned attributes)
 {
-    return add(vm, propertyName, attributes, [] (const GCSafeConcurrentJSLocker&, PropertyOffset) { });
+    return add(vm, propertyName, attributes, [] (const GCSafeConcurrentJSLocker&, PropertyOffset, unsigned) { });
 }
 
 PropertyOffset Structure::remove(PropertyName propertyName)

Modified: trunk/Source/_javascript_Core/runtime/Structure.h (208896 => 208897)


--- trunk/Source/_javascript_Core/runtime/Structure.h	2016-11-18 22:04:43 UTC (rev 208896)
+++ trunk/Source/_javascript_Core/runtime/Structure.h	2016-11-18 22:11:51 UTC (rev 208897)
@@ -294,26 +294,11 @@
 
     unsigned outOfLineCapacity() const
     {
-        ASSERT(checkOffsetConsistency());
-            
-        unsigned outOfLineSize = this->outOfLineSize();
-
-        if (!outOfLineSize)
-            return 0;
-
-        if (outOfLineSize <= initialOutOfLineCapacity)
-            return initialOutOfLineCapacity;
-
-        ASSERT(outOfLineSize > initialOutOfLineCapacity);
-        COMPILE_ASSERT(outOfLineGrowthFactor == 2, outOfLineGrowthFactor_is_two);
-        return WTF::roundUpToPowerOfTwo(outOfLineSize);
+        return outOfLineCapacity(m_offset);
     }
     unsigned outOfLineSize() const
     {
-        ASSERT(checkOffsetConsistency());
-        ASSERT(structure()->classInfo() == info());
-            
-        return numberOfOutOfLineSlotsForLastOffset(m_offset);
+        return outOfLineSize(m_offset);
     }
     bool hasInlineStorage() const
     {
@@ -636,7 +621,34 @@
     void findStructuresAndMapForMaterialization(Vector<Structure*, 8>& structures, Structure*&, PropertyTable*&);
     
     static Structure* toDictionaryTransition(VM&, Structure*, DictionaryKind, DeferredStructureTransitionWatchpointFire* = nullptr);
+    
+    unsigned outOfLineCapacity(PropertyOffset lastOffset) const
+    {
+        unsigned outOfLineSize = this->outOfLineSize(lastOffset);
 
+        // This algorithm completely determines the out-of-line property storage growth algorithm.
+        // The JSObject code will only trigger a resize if the value returned by this algorithm
+        // changed between the new and old structure. So, it's important to keep this simple because
+        // it's on a fast path.
+        
+        if (!outOfLineSize)
+            return 0;
+
+        if (outOfLineSize <= initialOutOfLineCapacity)
+            return initialOutOfLineCapacity;
+
+        ASSERT(outOfLineSize > initialOutOfLineCapacity);
+        COMPILE_ASSERT(outOfLineGrowthFactor == 2, outOfLineGrowthFactor_is_two);
+        return WTF::roundUpToPowerOfTwo(outOfLineSize);
+    }
+    
+    unsigned outOfLineSize(PropertyOffset lastOffset) const
+    {
+        ASSERT(structure()->classInfo() == info());
+            
+        return numberOfOutOfLineSlotsForLastOffset(lastOffset);
+    }
+
     template<typename Func>
     PropertyOffset add(VM&, PropertyName, unsigned attributes, const Func&);
     PropertyOffset add(VM&, PropertyName, unsigned attributes);

Modified: trunk/Source/_javascript_Core/runtime/StructureInlines.h (208896 => 208897)


--- trunk/Source/_javascript_Core/runtime/StructureInlines.h	2016-11-18 22:04:43 UTC (rev 208896)
+++ trunk/Source/_javascript_Core/runtime/StructureInlines.h	2016-11-18 22:11:51 UTC (rev 208897)
@@ -304,11 +304,14 @@
 
     PropertyOffset newOffset = table->nextOffset(m_inlineCapacity);
     
-    table->add(PropertyMapEntry(rep, newOffset, attributes), m_offset, PropertyTable::PropertyOffsetMayChange);
+    PropertyOffset newLastOffset = m_offset;
+    table->add(PropertyMapEntry(rep, newOffset, attributes), newLastOffset, PropertyTable::PropertyOffsetMayChange);
     
+    func(locker, newOffset, outOfLineCapacity(newLastOffset));
+    vm.heap.mutatorFence();
+    m_offset = newLastOffset;
+
     checkConsistency();
-
-    func(locker, newOffset);
     return newOffset;
 }
 
_______________________________________________
webkit-changes mailing list
[email protected]
https://lists.webkit.org/mailman/listinfo/webkit-changes

Reply via email to