Title: [210971] trunk/Source/_javascript_Core
Revision
210971
Author
[email protected]
Date
2017-01-20 10:10:55 -0800 (Fri, 20 Jan 2017)

Log Message

We should flash a safepoint before each DFG/FTL phase
https://bugs.webkit.org/show_bug.cgi?id=167234

Reviewed by Filip Pizlo.

The recent GC changes caused us to regress Kraken because of a
longstanding issue that happened to be hit with higher frequency because
of a change in timing between when a particular GC was happening and
when a particular FTL compilation was happening. The regression is caused
by the GC was waiting for a large function to make it through the DFG portion
of an FTL compilation. This was taking 20ms-30ms and started happened during a
particular test with much higher frequency.

This means that anytime the GC waits for this compilation, the test ran at least
~20ms slower because the GC waits for the compiler threads the mutator is stopped.

It's good that we have such an easily reproducible case of this performance
issue because it will effect many real JS programs, especially ones with
large functions that get hot.

The most straight forward solution to fix this is to flash a safepoint before
each phase, allowing the GC to suspend the compiler if needed. In my testing,
this progresses Kraken in the browser, and doesn't regress anything else. This
solution also makes the most sense. I did some analysis on the compilation time
of this function that took ~20-30ms to pass through the DFG phases, and
the phase times were mostly evenly distributed. Some took longer than others,
but no phase was longer than 3ms. Most were in the 0.25ms to 1.5ms range.

* dfg/DFGPlan.cpp:
(JSC::DFG::Plan::compileInThreadImpl):
* dfg/DFGSafepoint.cpp:
(JSC::DFG::Safepoint::begin):
* runtime/Options.h:

Modified Paths

Diff

Modified: trunk/Source/_javascript_Core/ChangeLog (210970 => 210971)


--- trunk/Source/_javascript_Core/ChangeLog	2017-01-20 18:09:30 UTC (rev 210970)
+++ trunk/Source/_javascript_Core/ChangeLog	2017-01-20 18:10:55 UTC (rev 210971)
@@ -1,3 +1,39 @@
+2017-01-20  Saam Barati  <[email protected]>
+
+        We should flash a safepoint before each DFG/FTL phase
+        https://bugs.webkit.org/show_bug.cgi?id=167234
+
+        Reviewed by Filip Pizlo.
+
+        The recent GC changes caused us to regress Kraken because of a
+        longstanding issue that happened to be hit with higher frequency because
+        of a change in timing between when a particular GC was happening and 
+        when a particular FTL compilation was happening. The regression is caused
+        by the GC was waiting for a large function to make it through the DFG portion
+        of an FTL compilation. This was taking 20ms-30ms and started happened during a
+        particular test with much higher frequency.
+        
+        This means that anytime the GC waits for this compilation, the test ran at least
+        ~20ms slower because the GC waits for the compiler threads the mutator is stopped.
+        
+        It's good that we have such an easily reproducible case of this performance
+        issue because it will effect many real JS programs, especially ones with
+        large functions that get hot.
+        
+        The most straight forward solution to fix this is to flash a safepoint before
+        each phase, allowing the GC to suspend the compiler if needed. In my testing,
+        this progresses Kraken in the browser, and doesn't regress anything else. This
+        solution also makes the most sense. I did some analysis on the compilation time
+        of this function that took ~20-30ms to pass through the DFG phases, and
+        the phase times were mostly evenly distributed. Some took longer than others,
+        but no phase was longer than 3ms. Most were in the 0.25ms to 1.5ms range.
+
+        * dfg/DFGPlan.cpp:
+        (JSC::DFG::Plan::compileInThreadImpl):
+        * dfg/DFGSafepoint.cpp:
+        (JSC::DFG::Safepoint::begin):
+        * runtime/Options.h:
+
 2017-01-20  Skachkov Oleksandr  <[email protected]>
 
         Super property access in base class constructor doesn't work

Modified: trunk/Source/_javascript_Core/dfg/DFGPlan.cpp (210970 => 210971)


--- trunk/Source/_javascript_Core/dfg/DFGPlan.cpp	2017-01-20 18:09:30 UTC (rev 210970)
+++ trunk/Source/_javascript_Core/dfg/DFGPlan.cpp	2017-01-20 18:10:55 UTC (rev 210971)
@@ -254,6 +254,22 @@
     }
 
     codeBlock->setCalleeSaveRegisters(RegisterSet::dfgCalleeSaveRegisters());
+
+    bool changed = false;
+
+#define RUN_PHASE(phase)                                         \
+    do {                                                         \
+        if (Options::safepointBeforeEachPhase()) {               \
+            Safepoint::Result safepointResult;                   \
+            {                                                    \
+                GraphSafepoint safepoint(dfg, safepointResult);  \
+            }                                                    \
+            if (safepointResult.didGetCancelled())               \
+                return CancelPath;                               \
+        }                                                        \
+        changed |= phase(dfg);                                   \
+    } while (false);                                             \
+
     
     // By this point the DFG bytecode parser will have potentially mutated various tables
     // in the CodeBlock. This is a good time to perform an early shrink, which is more
@@ -269,16 +285,16 @@
         dfg.dump();
     }
 
-    performLiveCatchVariablePreservationPhase(dfg);
+    RUN_PHASE(performLiveCatchVariablePreservationPhase);
 
     if (Options::useMaximalFlushInsertionPhase())
-        performMaximalFlushInsertion(dfg);
+        RUN_PHASE(performMaximalFlushInsertion);
     
-    performCPSRethreading(dfg);
-    performUnification(dfg);
-    performPredictionInjection(dfg);
+    RUN_PHASE(performCPSRethreading);
+    RUN_PHASE(performUnification);
+    RUN_PHASE(performPredictionInjection);
     
-    performStaticExecutionCountEstimation(dfg);
+    RUN_PHASE(performStaticExecutionCountEstimation);
     
     if (mode == FTLForOSREntryMode) {
         bool result = performOSREntrypointCreation(dfg);
@@ -286,18 +302,18 @@
             finalizer = std::make_unique<FailedFinalizer>(*this);
             return FailPath;
         }
-        performCPSRethreading(dfg);
+        RUN_PHASE(performCPSRethreading);
     }
     
     if (validationEnabled())
         validate(dfg);
     
-    performBackwardsPropagation(dfg);
-    performPredictionPropagation(dfg);
-    performFixup(dfg);
-    performStructureRegistration(dfg);
-    performInvalidationPointInjection(dfg);
-    performTypeCheckHoisting(dfg);
+    RUN_PHASE(performBackwardsPropagation);
+    RUN_PHASE(performPredictionPropagation);
+    RUN_PHASE(performFixup);
+    RUN_PHASE(performStructureRegistration);
+    RUN_PHASE(performInvalidationPointInjection);
+    RUN_PHASE(performTypeCheckHoisting);
     
     dfg.m_fixpointState = FixpointNotConverged;
     
@@ -309,18 +325,18 @@
     if (validationEnabled())
         validate(dfg);
         
-    performStrengthReduction(dfg);
-    performCPSRethreading(dfg);
-    performCFA(dfg);
-    performConstantFolding(dfg);
-    bool changed = false;
-    changed |= performCFGSimplification(dfg);
-    changed |= performLocalCSE(dfg);
+    RUN_PHASE(performStrengthReduction);
+    RUN_PHASE(performCPSRethreading);
+    RUN_PHASE(performCFA);
+    RUN_PHASE(performConstantFolding);
+    changed = false;
+    RUN_PHASE(performCFGSimplification);
+    RUN_PHASE(performLocalCSE);
     
     if (validationEnabled())
         validate(dfg);
     
-    performCPSRethreading(dfg);
+    RUN_PHASE(performCPSRethreading);
     if (!isFTL(mode)) {
         // Only run this if we're not FTLing, because currently for a LoadVarargs that is forwardable and
         // in a non-varargs inlined call frame, this will generate ForwardVarargs while the FTL
@@ -341,11 +357,11 @@
         // ArgumentsEliminationPhase does everything that this phase does, and it doesn't introduce this
         // pathology.
         
-        changed |= performVarargsForwarding(dfg); // Do this after CFG simplification and CPS rethreading.
+        RUN_PHASE(performVarargsForwarding); // Do this after CFG simplification and CPS rethreading.
     }
     if (changed) {
-        performCFA(dfg);
-        performConstantFolding(dfg);
+        RUN_PHASE(performCFA);
+        RUN_PHASE(performConstantFolding);
     }
     
     // If we're doing validation, then run some analyses, to give them an opportunity
@@ -360,17 +376,17 @@
     case DFGMode: {
         dfg.m_fixpointState = FixpointConverged;
     
-        performTierUpCheckInjection(dfg);
+        RUN_PHASE(performTierUpCheckInjection);
 
-        performFastStoreBarrierInsertion(dfg);
-        performStoreBarrierClustering(dfg);
-        performCleanUp(dfg);
-        performCPSRethreading(dfg);
-        performDCE(dfg);
-        performPhantomInsertion(dfg);
-        performStackLayout(dfg);
-        performVirtualRegisterAllocation(dfg);
-        performWatchpointCollection(dfg);
+        RUN_PHASE(performFastStoreBarrierInsertion);
+        RUN_PHASE(performStoreBarrierClustering);
+        RUN_PHASE(performCleanUp);
+        RUN_PHASE(performCPSRethreading);
+        RUN_PHASE(performDCE);
+        RUN_PHASE(performPhantomInsertion);
+        RUN_PHASE(performStackLayout);
+        RUN_PHASE(performVirtualRegisterAllocation);
+        RUN_PHASE(performWatchpointCollection);
         dumpAndVerifyGraph(dfg, "Graph after optimization:");
         
         JITCompiler dataFlowJIT(dfg);
@@ -390,37 +406,37 @@
             return FailPath;
         }
         
-        performCleanUp(dfg); // Reduce the graph size a bit.
-        performCriticalEdgeBreaking(dfg);
+        RUN_PHASE(performCleanUp); // Reduce the graph size a bit.
+        RUN_PHASE(performCriticalEdgeBreaking);
         if (Options::createPreHeaders())
-            performLoopPreHeaderCreation(dfg);
-        performCPSRethreading(dfg);
-        performSSAConversion(dfg);
-        performSSALowering(dfg);
+            RUN_PHASE(performLoopPreHeaderCreation);
+        RUN_PHASE(performCPSRethreading);
+        RUN_PHASE(performSSAConversion);
+        RUN_PHASE(performSSALowering);
         
         // Ideally, these would be run to fixpoint with the object allocation sinking phase.
-        performArgumentsElimination(dfg);
+        RUN_PHASE(performArgumentsElimination);
         if (Options::usePutStackSinking())
-            performPutStackSinking(dfg);
+            RUN_PHASE(performPutStackSinking);
         
-        performConstantHoisting(dfg);
-        performGlobalCSE(dfg);
-        performLivenessAnalysis(dfg);
-        performCFA(dfg);
-        performConstantFolding(dfg);
-        performCleanUp(dfg); // Reduce the graph size a lot.
+        RUN_PHASE(performConstantHoisting);
+        RUN_PHASE(performGlobalCSE);
+        RUN_PHASE(performLivenessAnalysis);
+        RUN_PHASE(performCFA);
+        RUN_PHASE(performConstantFolding);
+        RUN_PHASE(performCleanUp); // Reduce the graph size a lot.
         changed = false;
-        changed |= performStrengthReduction(dfg);
+        RUN_PHASE(performStrengthReduction);
         if (Options::useObjectAllocationSinking()) {
-            changed |= performCriticalEdgeBreaking(dfg);
-            changed |= performObjectAllocationSinking(dfg);
+            RUN_PHASE(performCriticalEdgeBreaking);
+            RUN_PHASE(performObjectAllocationSinking);
         }
         if (changed) {
             // State-at-tail and state-at-head will be invalid if we did strength reduction since
             // it might increase live ranges.
-            performLivenessAnalysis(dfg);
-            performCFA(dfg);
-            performConstantFolding(dfg);
+            RUN_PHASE(performLivenessAnalysis);
+            RUN_PHASE(performCFA);
+            RUN_PHASE(performConstantFolding);
         }
         
         // Currently, this relies on pre-headers still being valid. That precludes running CFG
@@ -428,9 +444,9 @@
         // wrong with running LICM earlier, if we wanted to put other CFG transforms above this point.
         // Alternatively, we could run loop pre-header creation after SSA conversion - but if we did that
         // then we'd need to do some simple SSA fix-up.
-        performLivenessAnalysis(dfg);
-        performCFA(dfg);
-        performLICM(dfg);
+        RUN_PHASE(performLivenessAnalysis);
+        RUN_PHASE(performCFA);
+        RUN_PHASE(performLICM);
 
         // FIXME: Currently: IntegerRangeOptimization *must* be run after LICM.
         //
@@ -439,29 +455,29 @@
         // by IntegerRangeOptimization.
         //
         // Ideally, the dependencies should be explicit. See https://bugs.webkit.org/show_bug.cgi?id=157534.
-        performLivenessAnalysis(dfg);
-        performIntegerRangeOptimization(dfg);
+        RUN_PHASE(performLivenessAnalysis);
+        RUN_PHASE(performIntegerRangeOptimization);
         
-        performCleanUp(dfg);
-        performIntegerCheckCombining(dfg);
-        performGlobalCSE(dfg);
+        RUN_PHASE(performCleanUp);
+        RUN_PHASE(performIntegerCheckCombining);
+        RUN_PHASE(performGlobalCSE);
         
         // At this point we're not allowed to do any further code motion because our reasoning
         // about code motion assumes that it's OK to insert GC points in random places.
         dfg.m_fixpointState = FixpointConverged;
         
-        performLivenessAnalysis(dfg);
-        performCFA(dfg);
-        performGlobalStoreBarrierInsertion(dfg);
-        performStoreBarrierClustering(dfg);
+        RUN_PHASE(performLivenessAnalysis);
+        RUN_PHASE(performCFA);
+        RUN_PHASE(performGlobalStoreBarrierInsertion);
+        RUN_PHASE(performStoreBarrierClustering);
         if (Options::useMovHintRemoval())
-            performMovHintRemoval(dfg);
-        performCleanUp(dfg);
-        performDCE(dfg); // We rely on this to kill dead code that won't be recognized as dead by B3.
-        performStackLayout(dfg);
-        performLivenessAnalysis(dfg);
-        performOSRAvailabilityAnalysis(dfg);
-        performWatchpointCollection(dfg);
+            RUN_PHASE(performMovHintRemoval);
+        RUN_PHASE(performCleanUp);
+        RUN_PHASE(performDCE); // We rely on this to kill dead code that won't be recognized as dead by B3.
+        RUN_PHASE(performStackLayout);
+        RUN_PHASE(performLivenessAnalysis);
+        RUN_PHASE(performOSRAvailabilityAnalysis);
+        RUN_PHASE(performWatchpointCollection);
         
         if (FTL::canCompile(dfg) == FTL::CannotCompile) {
             finalizer = std::make_unique<FailedFinalizer>(*this);
@@ -521,6 +537,8 @@
         RELEASE_ASSERT_NOT_REACHED();
         return FailPath;
     }
+
+#undef RUN_PHASE
 }
 
 bool Plan::isStillValid()

Modified: trunk/Source/_javascript_Core/dfg/DFGSafepoint.cpp (210970 => 210971)


--- trunk/Source/_javascript_Core/dfg/DFGSafepoint.cpp	2017-01-20 18:09:30 UTC (rev 210970)
+++ trunk/Source/_javascript_Core/dfg/DFGSafepoint.cpp	2017-01-20 18:10:55 UTC (rev 210971)
@@ -1,5 +1,5 @@
 /*
- * Copyright (C) 2014, 2016 Apple Inc. All rights reserved.
+ * Copyright (C) 2014-2017 Apple Inc. All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
@@ -80,7 +80,7 @@
     if (ThreadData* data = "" {
         RELEASE_ASSERT(!data->m_safepoint);
         data->m_safepoint = this;
-        data->m_rightToRun.unlock();
+        data->m_rightToRun.unlockFairly();
     }
 }
 

Modified: trunk/Source/_javascript_Core/runtime/Options.h (210970 => 210971)


--- trunk/Source/_javascript_Core/runtime/Options.h	2017-01-20 18:09:30 UTC (rev 210970)
+++ trunk/Source/_javascript_Core/runtime/Options.h	2017-01-20 18:10:55 UTC (rev 210971)
@@ -158,6 +158,7 @@
     v(bool, dumpB3GraphAtEachPhase, false, Normal, "dumps the B3 graph at each phase of compilation") \
     v(bool, dumpAirGraphAtEachPhase, false, Normal, "dumps the Air graph at each phase of compilation") \
     v(bool, verboseDFGByteCodeParsing, false, Normal, nullptr) \
+    v(bool, safepointBeforeEachPhase, true, Normal, nullptr) \
     v(bool, verboseCompilation, false, Normal, nullptr) \
     v(bool, verboseFTLCompilation, false, Normal, nullptr) \
     v(bool, logCompilationChanges, false, Normal, nullptr) \
_______________________________________________
webkit-changes mailing list
[email protected]
https://lists.webkit.org/mailman/listinfo/webkit-changes

Reply via email to