Title: [214901] trunk/Source/JavaScriptCore
Revision: 214901
Author: [email protected]
Date: 2017-04-04 14:48:41 -0700 (Tue, 04 Apr 2017)

Log Message

Air::lowerAfterRegAlloc should bail early if it finds no Shuffles or ColdCCalls
https://bugs.webkit.org/show_bug.cgi?id=170305

Reviewed by Saam Barati.
        
This reduces and sometimes completely eliminates the need to run lowerAfterRegAlloc().
        
This lowers the Shuffle for the arguments of a CCall before register allocation unless
the CCall arguments require a real shuffle (for example, when an argument's source is
already a register). This also lowers a ColdCCall like a CCall when optLevel < 2.
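
The ordering used when no Shuffle is needed can be sketched as follows. This is a minimal illustration, not the code from the patch: Arg, ShufflePair, Lowering and lowerCallArguments are simplified, hypothetical stand-ins, and moves are modelled as strings rather than Air instructions.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Illustrative stand-ins only; these are not the real Air::Arg or Air::ShufflePair.
enum class ArgKind { Tmp, Reg, Memory };

struct Arg {
    ArgKind kind;
    std::string name;
};

struct ShufflePair {
    Arg src;
    Arg dst;
};

struct Lowering {
    bool needsShuffle { false };     // true: a real Shuffle must be kept for later lowering
    std::vector<std::string> moves;  // otherwise: plain moves, in emission order
};

// Hypothetical helper mirroring the decision described in the log message.
inline Lowering lowerCallArguments(const std::vector<ShufflePair>& pairs)
{
    Lowering result;

    for (const ShufflePair& pair : pairs) {
        // Assumption: destinations come from the calling convention, so they are
        // either argument registers or stack slots, never temporaries.
        assert(pair.dst.kind != ArgKind::Tmp);

        // Any register source means the moves could clobber each other,
        // so fall back to a real Shuffle.
        if (pair.src.kind == ArgKind::Reg) {
            result.needsShuffle = true;
            return result;
        }
    }

    // First the moves to memory, in the hope that this is the last use of the operand.
    for (const ShufflePair& pair : pairs) {
        if (pair.dst.kind == ArgKind::Memory)
            result.moves.push_back(pair.src.name + " => " + pair.dst.name);
    }

    // Then fill the argument registers, starting with the first one.
    for (const ShufflePair& pair : pairs) {
        if (pair.dst.kind != ArgKind::Memory)
            result.moves.push_back(pair.src.name + " => " + pair.dst.name);
    }

    return result;
}
```

With three temporaries going to a first argument register, a stack slot and a second argument register, the sketch emits the stack move first and then the two register moves in argument order.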
        
Finally, lowerAfterRegAlloc() now checks whether there are any Shuffles or ColdCCalls
before it does anything else. For wasm at -O1, this means that the phase doesn't run at
all. This is a ~3% wasm -O1 compile time progression.
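
The early bail amounts to a cheap scan that runs before any expensive analysis. A minimal sketch, assuming simplified stand-in types (Opcode, Inst, BasicBlock, Code and lowerAfterRegAllocSketch here are hypothetical and much smaller than the real Air classes):

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified stand-ins for Air's opcodes, instructions and basic blocks.
enum class Opcode { Move, Add, CCall, ColdCCall, Shuffle };

struct Inst {
    Opcode opcode;
};

using BasicBlock = std::vector<Inst>;
using Code = std::vector<BasicBlock>;

// Only Shuffle and ColdCCall need work after register allocation.
inline bool isRelevant(const Inst& inst)
{
    return inst.opcode == Opcode::Shuffle || inst.opcode == Opcode::ColdCCall;
}

// Stops at the first relevant instruction in any block.
inline bool hasAnyRelevant(const Code& code)
{
    return std::any_of(code.begin(), code.end(), [](const BasicBlock& block) {
        return std::any_of(block.begin(), block.end(), isRelevant);
    });
}

// Hypothetical phase driver. Returns 0 when it bails early; otherwise returns the
// number of instructions the slow path (liveness analysis plus rewriting in the
// real phase) would have to process.
inline std::size_t lowerAfterRegAllocSketch(const Code& code)
{
    if (!hasAnyRelevant(code))
        return 0;

    std::size_t relevant = 0;
    for (const BasicBlock& block : code)
        relevant += static_cast<std::size_t>(std::count_if(block.begin(), block.end(), isRelevant));

    // Past the early return there is always at least one relevant instruction.
    assert(relevant > 0);
    return relevant;
}
```

A procedure that contains only plain CCalls and moves takes the early return, which is the situation the log message describes for wasm at -O1.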
        
To make this easy, I changed optLevel into a property of Procedure and Code rather than
an argument we thread through everything. I like how Procedure and Code are dumping
ground classes. This does not bother me. Note that I cloned optLevel into Procedure and
Code so that it's cheap to query inside Air phases.
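
The mirrored property can be sketched like this. The classes below are reduced, hypothetical stand-ins for B3::Procedure and Air::Code, the default value 2 is an assumption standing in for defaultOptLevel(), and usesGraphColoring is an illustrative helper that does not exist in the patch.

```cpp
#include <cassert>

// Stand-in for Air::Code: keeps its own copy so Air phases can query it directly.
class Code {
public:
    void setOptLevel(unsigned optLevel) { m_optLevel = optLevel; }
    unsigned optLevel() const { return m_optLevel; }

private:
    unsigned m_optLevel { 2 }; // assumed default; the real code calls defaultOptLevel()
};

// Stand-in for B3::Procedure: owns a Code and forwards the setting to it.
class Procedure {
public:
    void setOptLevel(unsigned optLevel)
    {
        m_optLevel = optLevel;
        m_code.setOptLevel(optLevel);
        // Both copies must agree after every update.
        assert(m_code.optLevel() == m_optLevel);
    }

    unsigned optLevel() const { return m_optLevel; }
    Code& code() { return m_code; }

private:
    Code m_code;
    unsigned m_optLevel { 2 }; // assumed default, as above
};

// Illustrative Air-side query: a phase that only sees Code can still pick a strategy.
inline bool usesGraphColoring(const Code& code)
{
    return code.optLevel() >= 2;
}
```

A caller sets the level once on the Procedure, for example `procedure.setOptLevel(1)`, and every later phase reads it from whichever object it already holds instead of receiving it as a parameter.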

* b3/B3Compile.cpp:
(JSC::B3::compile):
* b3/B3Compile.h:
* b3/B3Generate.cpp:
(JSC::B3::prepareForGeneration):
(JSC::B3::generateToAir):
* b3/B3Generate.h:
* b3/B3Procedure.cpp:
(JSC::B3::Procedure::setOptLevel):
* b3/B3Procedure.h:
(JSC::B3::Procedure::optLevel):
* b3/air/AirCode.h:
(JSC::B3::Air::Code::isPinned):
(JSC::B3::Air::Code::setOptLevel):
(JSC::B3::Air::Code::optLevel):
* b3/air/AirEmitShuffle.cpp:
(JSC::B3::Air::ShufflePair::bank):
(JSC::B3::Air::ShufflePair::opcode):
(JSC::B3::Air::ShufflePair::inst):
(JSC::B3::Air::emitShuffle):
* b3/air/AirEmitShuffle.h:
(JSC::B3::Air::moveFor):
* b3/air/AirGenerate.cpp:
(JSC::B3::Air::prepareForGeneration):
* b3/air/AirGenerate.h:
* b3/air/AirLowerAfterRegAlloc.cpp:
(JSC::B3::Air::lowerAfterRegAlloc):
* b3/air/AirLowerMacros.cpp:
(JSC::B3::Air::lowerMacros):
* b3/testb3.cpp:
(JSC::B3::compileProc):
* wasm/WasmB3IRGenerator.cpp:
(JSC::Wasm::parseAndCompile):

Diff

Modified: trunk/Source/JavaScriptCore/ChangeLog (214900 => 214901)


--- trunk/Source/JavaScriptCore/ChangeLog	2017-04-04 21:32:15 UTC (rev 214900)
+++ trunk/Source/JavaScriptCore/ChangeLog	2017-04-04 21:48:41 UTC (rev 214901)
@@ -1,5 +1,61 @@
 2017-04-04  Filip Pizlo  <[email protected]>
 
+        Air::lowerAfterRegAlloc should bail early if it finds no Shuffles or ColdCCalls
+        https://bugs.webkit.org/show_bug.cgi?id=170305
+
+        Reviewed by Saam Barati.
+        
+        This reduces and sometimes completely eliminates the need to run lowerAfterRegAlloc().
+        
+        This lowers the Shuffle for the arguments of a CCall before register allocation unless
+        the CCall arguments require a real shuffle (like if the CCall arguments were argument
+        registers). This lowers a ColdCCall like a CCall for optLevel<2.
+        
+        Finally, lowerAfterRegAlloc() now checks if there are any Shuffles or CCalls before it
+        does anything else. For wasm at -O1, this means that the phase doesn't run at all. This
+        is a ~3% wasm -O1 compile time progression.
+        
+        To make this easy, I changed optLevel into a property of Procedure and Code rather than
+        an argument we thread through everything. I like how Procedure and Code are dumping
+        ground classes. This does not bother me. Note that I cloned optLevel into Procedure and
+        Code so that it's cheap to query inside Air phases.
+
+        * b3/B3Compile.cpp:
+        (JSC::B3::compile):
+        * b3/B3Compile.h:
+        * b3/B3Generate.cpp:
+        (JSC::B3::prepareForGeneration):
+        (JSC::B3::generateToAir):
+        * b3/B3Generate.h:
+        * b3/B3Procedure.cpp:
+        (JSC::B3::Procedure::setOptLevel):
+        * b3/B3Procedure.h:
+        (JSC::B3::Procedure::optLevel):
+        * b3/air/AirCode.h:
+        (JSC::B3::Air::Code::isPinned):
+        (JSC::B3::Air::Code::setOptLevel):
+        (JSC::B3::Air::Code::optLevel):
+        * b3/air/AirEmitShuffle.cpp:
+        (JSC::B3::Air::ShufflePair::bank):
+        (JSC::B3::Air::ShufflePair::opcode):
+        (JSC::B3::Air::ShufflePair::inst):
+        (JSC::B3::Air::emitShuffle):
+        * b3/air/AirEmitShuffle.h:
+        (JSC::B3::Air::moveFor):
+        * b3/air/AirGenerate.cpp:
+        (JSC::B3::Air::prepareForGeneration):
+        * b3/air/AirGenerate.h:
+        * b3/air/AirLowerAfterRegAlloc.cpp:
+        (JSC::B3::Air::lowerAfterRegAlloc):
+        * b3/air/AirLowerMacros.cpp:
+        (JSC::B3::Air::lowerMacros):
+        * b3/testb3.cpp:
+        (JSC::B3::compileProc):
+        * wasm/WasmB3IRGenerator.cpp:
+        (JSC::Wasm::parseAndCompile):
+
+2017-04-04  Filip Pizlo  <[email protected]>
+
         Don't need to Air::reportUsedRegisters for wasm at -O1
         https://bugs.webkit.org/show_bug.cgi?id=170459
 

Modified: trunk/Source/JavaScriptCore/b3/B3Compile.cpp (214900 => 214901)


--- trunk/Source/JavaScriptCore/b3/B3Compile.cpp	2017-04-04 21:32:15 UTC (rev 214900)
+++ trunk/Source/JavaScriptCore/b3/B3Compile.cpp	2017-04-04 21:48:41 UTC (rev 214901)
@@ -38,11 +38,11 @@
 
 namespace JSC { namespace B3 {
 
-Compilation compile(Procedure& proc, unsigned optLevel)
+Compilation compile(Procedure& proc)
 {
     TimingScope timingScope("Compilation");
     
-    prepareForGeneration(proc, optLevel);
+    prepareForGeneration(proc);
     
     CCallHelpers jit;
     generate(proc, jit);

Modified: trunk/Source/JavaScriptCore/b3/B3Compile.h (214900 => 214901)


--- trunk/Source/JavaScriptCore/b3/B3Compile.h	2017-04-04 21:32:15 UTC (rev 214900)
+++ trunk/Source/JavaScriptCore/b3/B3Compile.h	2017-04-04 21:48:41 UTC (rev 214901)
@@ -46,7 +46,7 @@
 // Then you keep the Compilation object alive for as long as you want to be able to run the code.
 // If this API feels too high-level, you can use B3::generate() directly.
 
-JS_EXPORT_PRIVATE Compilation compile(Procedure&, unsigned optLevel = defaultOptLevel());
+JS_EXPORT_PRIVATE Compilation compile(Procedure&);
 
 } } // namespace JSC::B3
 

Modified: trunk/Source/JavaScriptCore/b3/B3Generate.cpp (214900 => 214901)


--- trunk/Source/JavaScriptCore/b3/B3Generate.cpp	2017-04-04 21:32:15 UTC (rev 214900)
+++ trunk/Source/JavaScriptCore/b3/B3Generate.cpp	2017-04-04 21:48:41 UTC (rev 214901)
@@ -52,12 +52,12 @@
 
 namespace JSC { namespace B3 {
 
-void prepareForGeneration(Procedure& procedure, unsigned optLevel)
+void prepareForGeneration(Procedure& procedure)
 {
     TimingScope timingScope("prepareForGeneration");
 
-    generateToAir(procedure, optLevel);
-    Air::prepareForGeneration(procedure.code(), optLevel);
+    generateToAir(procedure);
+    Air::prepareForGeneration(procedure.code());
 }
 
 void generate(Procedure& procedure, CCallHelpers& jit)
@@ -65,7 +65,7 @@
     Air::generate(procedure.code(), jit);
 }
 
-void generateToAir(Procedure& procedure, unsigned optLevel)
+void generateToAir(Procedure& procedure)
 {
     TimingScope timingScope("generateToAir");
     
@@ -80,7 +80,7 @@
     if (shouldValidateIR())
         validate(procedure);
 
-    if (optLevel >= 2) {
+    if (procedure.optLevel() >= 2) {
         reduceDoubleToFloat(procedure);
         reduceStrength(procedure);
         eliminateCommonSubexpressions(procedure);
@@ -91,7 +91,7 @@
         
         // FIXME: Add more optimizations here.
         // https://bugs.webkit.org/show_bug.cgi?id=150507
-    } else if (optLevel >= 1) {
+    } else if (procedure.optLevel() >= 1) {
         // FIXME: Explore better "quick mode" optimizations.
         reduceStrength(procedure);
     }
@@ -99,7 +99,7 @@
     // This puts the IR in quirks mode.
     lowerMacros(procedure);
 
-    if (optLevel >= 2) {
+    if (procedure.optLevel() >= 2) {
         reduceStrength(procedure);
 
         // FIXME: Add more optimizations here.

Modified: trunk/Source/JavaScriptCore/b3/B3Generate.h (214900 => 214901)


--- trunk/Source/JavaScriptCore/b3/B3Generate.h	2017-04-04 21:32:15 UTC (rev 214900)
+++ trunk/Source/JavaScriptCore/b3/B3Generate.h	2017-04-04 21:48:41 UTC (rev 214901)
@@ -40,7 +40,7 @@
 
 // This takes a B3::Procedure, optimizes it in-place, lowers it to Air, and prepares the Air for
 // generation.
-JS_EXPORT_PRIVATE void prepareForGeneration(Procedure&, unsigned optLevel = defaultOptLevel());
+JS_EXPORT_PRIVATE void prepareForGeneration(Procedure&);
 
 // This takes a B3::Procedure that has been prepared for generation (i.e. it has been lowered to Air and
 // the Air has been prepared for generation) and generates it. This is the equivalent of calling
@@ -50,7 +50,7 @@
 // This takes a B3::Procedure, optimizes it in-place, and lowers it to Air. You can then generate
 // the Air to machine code using Air::prepareForGeneration() and Air::generate() on the Procedure's
 // code().
-void generateToAir(Procedure&, unsigned optLevel = defaultOptLevel());
+void generateToAir(Procedure&);
 
 } } // namespace JSC::B3
 

Modified: trunk/Source/JavaScriptCore/b3/B3Procedure.cpp (214900 => 214901)


--- trunk/Source/JavaScriptCore/b3/B3Procedure.cpp	2017-04-04 21:32:15 UTC (rev 214900)
+++ trunk/Source/JavaScriptCore/b3/B3Procedure.cpp	2017-04-04 21:48:41 UTC (rev 214901)
@@ -334,6 +334,12 @@
     code().pinRegister(reg);
 }
 
+void Procedure::setOptLevel(unsigned optLevel)
+{
+    m_optLevel = optLevel;
+    code().setOptLevel(optLevel);
+}
+
 unsigned Procedure::frameSize() const
 {
     return code().frameSize();

Modified: trunk/Source/JavaScriptCore/b3/B3Procedure.h (214900 => 214901)


--- trunk/Source/JavaScriptCore/b3/B3Procedure.h	2017-04-04 21:32:15 UTC (rev 214900)
+++ trunk/Source/JavaScriptCore/b3/B3Procedure.h	2017-04-04 21:48:41 UTC (rev 214901)
@@ -233,6 +233,9 @@
     // This tells the register allocators to stay away from this register.
     JS_EXPORT_PRIVATE void pinRegister(Reg);
     
+    JS_EXPORT_PRIVATE void setOptLevel(unsigned value);
+    unsigned optLevel() const { return m_optLevel; }
+    
     // You can turn off used registers calculation. This may speed up compilation a bit. But if
     // you turn it off then you cannot use StackmapGenerationParams::usedRegisters() or
     // StackmapGenerationParams::unavailableRegisters().
@@ -273,6 +276,7 @@
     RefPtr<SharedTask<void(PrintStream&, Origin)>> m_originPrinter;
     const void* m_frontendData;
     PCToOriginMap m_pcToOriginMap;
+    unsigned m_optLevel { defaultOptLevel() };
     bool m_needsUsedRegisters { true };
     bool m_hasQuirks { false };
 };

Modified: trunk/Source/JavaScriptCore/b3/air/AirCode.h (214900 => 214901)


--- trunk/Source/JavaScriptCore/b3/air/AirCode.h	2017-04-04 21:32:15 UTC (rev 214900)
+++ trunk/Source/JavaScriptCore/b3/air/AirCode.h	2017-04-04 21:48:41 UTC (rev 214901)
@@ -88,9 +88,11 @@
     const RegisterSet& mutableRegs() const { return m_mutableRegs; }
     
     bool isPinned(Reg reg) const { return !mutableRegs().get(reg); }
-    
     void pinRegister(Reg);
     
+    void setOptLevel(unsigned optLevel) { m_optLevel = optLevel; }
+    unsigned optLevel() const { return m_optLevel; }
+    
     bool needsUsedRegisters() const;
 
     JS_EXPORT_PRIVATE BasicBlock* addBlock(double frequency = 1);
@@ -322,6 +324,7 @@
     RefPtr<WasmBoundsCheckGenerator> m_wasmBoundsCheckGenerator;
     const char* m_lastPhaseName;
     std::unique_ptr<Disassembler> m_disassembler;
+    unsigned m_optLevel { defaultOptLevel() };
 };
 
 } } } // namespace JSC::B3::Air

Modified: trunk/Source/JavaScriptCore/b3/air/AirEmitShuffle.cpp (214900 => 214901)


--- trunk/Source/JavaScriptCore/b3/air/AirEmitShuffle.cpp	2017-04-04 21:32:15 UTC (rev 214900)
+++ trunk/Source/JavaScriptCore/b3/air/AirEmitShuffle.cpp	2017-04-04 21:48:41 UTC (rev 214901)
@@ -65,6 +65,36 @@
 
 } // anonymous namespace
 
+Bank ShufflePair::bank() const
+{
+    if (src().isMemory() && dst().isMemory() && width() > pointerWidth()) {
+        // 8-byte memory-to-memory moves on a 32-bit platform are best handled as float moves.
+        return FP;
+    }
+    
+    if (src().isGP() && dst().isGP()) {
+        // This means that gpPairs gets memory-to-memory shuffles. The assumption is that we
+        // can do that more efficiently using GPRs, except in the special case above.
+        return GP;
+    }
+    
+    return FP;
+}
+
+Opcode ShufflePair::opcode() const
+{
+    return moveFor(bank(), width());
+}
+
+Inst ShufflePair::inst(Code* code, Value* origin) const
+{
+    if (UNLIKELY(src().isMemory() && dst().isMemory())) {
+        RELEASE_ASSERT(code);
+        return Inst(opcode(), origin, src(), dst(), code->newTmp(bank()));
+    }
+    return Inst(opcode(), origin, src(), dst());
+}
+
 void ShufflePair::dump(PrintStream& out) const
 {
     out.print(width(), ":", src(), "=>", dst());
@@ -261,14 +291,7 @@
     // ends with a register. We search for such a register right now.
 
     auto moveForWidth = [&] (Width width) -> Opcode {
-        switch (width) {
-        case Width32:
-            return bank == GP ? Move32 : MoveFloat;
-        case Width64:
-            return bank == GP ? Move : MoveDouble;
-        default:
-            RELEASE_ASSERT_NOT_REACHED();
-        }
+        return moveFor(bank, width);
     };
 
     Opcode conservativeMove = moveForWidth(conservativeWidth(bank));
@@ -520,15 +543,14 @@
     Vector<ShufflePair> gpPairs;
     Vector<ShufflePair> fpPairs;
     for (const ShufflePair& pair : pairs) {
-        if (pair.src().isMemory() && pair.dst().isMemory() && pair.width() > pointerWidth()) {
-            // 8-byte memory-to-memory moves on a 32-bit platform are best handled as float moves.
-            fpPairs.append(pair);
-        } else if (pair.src().isGP() && pair.dst().isGP()) {
-            // This means that gpPairs gets memory-to-memory shuffles. The assumption is that we
-            // can do that more efficiently using GPRs, except in the special case above.
+        switch (pair.bank()) {
+        case GP:
             gpPairs.append(pair);
-        } else
+            break;
+        case FP:
             fpPairs.append(pair);
+            break;
+        }
     }
 
     Vector<Inst> result;

Modified: trunk/Source/JavaScriptCore/b3/air/AirEmitShuffle.h (214900 => 214901)


--- trunk/Source/JavaScriptCore/b3/air/AirEmitShuffle.h	2017-04-04 21:32:15 UTC (rev 214900)
+++ trunk/Source/JavaScriptCore/b3/air/AirEmitShuffle.h	2017-04-04 21:48:41 UTC (rev 214901)
@@ -39,6 +39,19 @@
 
 class Code;
 
+inline Opcode moveFor(Bank bank, Width width)
+{
+    switch (width) {
+    case Width32:
+        return bank == GP ? Move32 : MoveFloat;
+    case Width64:
+        return bank == GP ? Move : MoveDouble;
+    default:
+        RELEASE_ASSERT_NOT_REACHED();
+        return Oops;
+    }
+}
+
 class ShufflePair {
 public:
     ShufflePair()
@@ -58,6 +71,14 @@
     // The width determines the kind of move we do. You can only choose Width32 or Width64 right now.
     // For GP, it picks between Move32 and Move. For FP, it picks between MoveFloat and MoveDouble.
     Width width() const { return m_width; }
+    
+    Bank bank() const;
+    Opcode opcode() const;
+    
+    // Creates an instruction for the move represented by this shuffle pair. You need to pass
+    // Code if this is a memory->memory pair. You can pass null if you know that it's not. In
+    // fact, passing null is a good way to assert that this is not a memory->memory pair.
+    Inst inst(Code*, Value* origin) const;
 
     void dump(PrintStream&) const;
     

Modified: trunk/Source/JavaScriptCore/b3/air/AirGenerate.cpp (214900 => 214901)


--- trunk/Source/JavaScriptCore/b3/air/AirGenerate.cpp	2017-04-04 21:32:15 UTC (rev 214900)
+++ trunk/Source/JavaScriptCore/b3/air/AirGenerate.cpp	2017-04-04 21:48:41 UTC (rev 214901)
@@ -58,7 +58,7 @@
 
 namespace JSC { namespace B3 { namespace Air {
 
-void prepareForGeneration(Code& code, unsigned optLevel)
+void prepareForGeneration(Code& code)
 {
     TimingScope timingScope("Air::prepareForGeneration");
     
@@ -90,7 +90,7 @@
     // For debugging, you can use spillEverything() to put everything to the stack between each Inst.
     if (Options::airSpillsEverything())
         spillEverything(code);
-    else if (optLevel >= 2)
+    else if (code.optLevel() >= 2)
         allocateRegistersByGraphColoring(code);
     else
         allocateRegistersByLinearScan(code);
@@ -100,7 +100,7 @@
         logRegisterPressure(code);
     }
     
-    if (optLevel >= 2) {
+    if (code.optLevel() >= 2) {
         // This replaces uses of spill slots with registers or constants if possible. It does this by
         // minimizing the amount that we perturb the already-chosen register allocation. It may extend
         // the live ranges of registers though.
@@ -124,7 +124,7 @@
 
     // This is needed to satisfy a requirement of B3::StackmapValue. This also removes dead
     // code. We can avoid running this when certain optimizations are disabled.
-    if (optLevel >= 2 || code.needsUsedRegisters())
+    if (code.optLevel() >= 2 || code.needsUsedRegisters())
         reportUsedRegisters(code);
 
     // Attempt to remove false dependencies between instructions created by partial register changes.

Modified: trunk/Source/JavaScriptCore/b3/air/AirGenerate.h (214900 => 214901)


--- trunk/Source/JavaScriptCore/b3/air/AirGenerate.h	2017-04-04 21:32:15 UTC (rev 214900)
+++ trunk/Source/JavaScriptCore/b3/air/AirGenerate.h	2017-04-04 21:48:41 UTC (rev 214901)
@@ -39,7 +39,7 @@
 
 // This takes an Air::Code that hasn't had any stack allocation and optionally hasn't had any
 // register allocation and does both of those things.
-JS_EXPORT_PRIVATE void prepareForGeneration(Code&, unsigned optLevel = defaultOptLevel());
+JS_EXPORT_PRIVATE void prepareForGeneration(Code&);
 
 // This generates the code using the given CCallHelpers instance. Note that this may call callbacks
 // in the supplied code as it is generating.

Modified: trunk/Source/JavaScriptCore/b3/air/AirLowerAfterRegAlloc.cpp (214900 => 214901)


--- trunk/Source/JavaScriptCore/b3/air/AirLowerAfterRegAlloc.cpp	2017-04-04 21:32:15 UTC (rev 214900)
+++ trunk/Source/JavaScriptCore/b3/air/AirLowerAfterRegAlloc.cpp	2017-04-04 21:48:41 UTC (rev 214901)
@@ -56,13 +56,23 @@
     if (verbose)
         dataLog("Code before lowerAfterRegAlloc:\n", code);
     
-    // FIXME:
-    // 1) This should bail early if there are no Shuffles or ColdCCalls.
-    //    https://bugs.webkit.org/show_bug.cgi?id=170305
-    // 2) We should not introduce Shuffles for normal calls.
-    //    https://bugs.webkit.org/show_bug.cgi?id=170306
-    // 3) We should emit ColdCCall only at optLevel==1.
-    //    https://bugs.webkit.org/show_bug.cgi?id=170307
+    auto isRelevant = [] (Inst& inst) -> bool {
+        return inst.kind.opcode == Shuffle || inst.kind.opcode == ColdCCall;
+    };
+    
+    bool haveAnyRelevant = false;
+    for (BasicBlock* block : code) {
+        for (Inst& inst : *block) {
+            if (isRelevant(inst)) {
+                haveAnyRelevant = true;
+                break;
+            }
+        }
+        if (haveAnyRelevant)
+            break;
+    }
+    if (!haveAnyRelevant)
+        return;
 
     HashMap<Inst*, RegisterSet> usedRegisters;
 
@@ -75,9 +85,7 @@
             
             RegisterSet set;
 
-            bool isRelevant = inst.kind.opcode == Shuffle || inst.kind.opcode == ColdCCall;
-            
-            if (isRelevant) {
+            if (isRelevant(inst)) {
                 for (Reg reg : localCalc.live())
                     set.set(reg);
             }
@@ -84,7 +92,7 @@
             
             localCalc.execute(instIndex);
 
-            if (isRelevant)
+            if (isRelevant(inst))
                 usedRegisters.add(&inst, set);
         }
     }

Modified: trunk/Source/JavaScriptCore/b3/air/AirLowerMacros.cpp (214900 => 214901)


--- trunk/Source/JavaScriptCore/b3/air/AirLowerMacros.cpp	2017-04-04 21:32:15 UTC (rev 214900)
+++ trunk/Source/JavaScriptCore/b3/air/AirLowerMacros.cpp	2017-04-04 21:48:41 UTC (rev 214901)
@@ -30,6 +30,7 @@
 
 #include "AirCCallingConvention.h"
 #include "AirCode.h"
+#include "AirEmitShuffle.h"
 #include "AirInsertionSet.h"
 #include "AirInstInlines.h"
 #include "AirPhaseScope.h"
@@ -46,23 +47,46 @@
     for (BasicBlock* block : code) {
         for (unsigned instIndex = 0; instIndex < block->size(); ++instIndex) {
             Inst& inst = block->at(instIndex);
-
-            switch (inst.kind.opcode) {
-            case CCall: {
+            
+            auto handleCall = [&] () {
                 CCallValue* value = inst.origin->as<CCallValue>();
                 Kind oldKind = inst.kind;
 
                 Vector<Arg> destinations = computeCCallingConvention(code, value);
-
-                Inst shuffleArguments(Shuffle, value);
+                
                 unsigned offset = value->type() == Void ? 0 : 1;
+                Vector<ShufflePair, 16> shufflePairs;
+                bool hasRegisterSource = false;
                 for (unsigned i = 1; i < destinations.size(); ++i) {
                     Value* child = value->child(i);
-                    shuffleArguments.args.append(inst.args[offset + i]);
-                    shuffleArguments.args.append(destinations[i]);
-                    shuffleArguments.args.append(Arg::widthArg(widthForType(child->type())));
+                    ShufflePair pair(inst.args[offset + i], destinations[i], widthForType(child->type()));
+                    shufflePairs.append(pair);
+                    hasRegisterSource |= pair.src().isReg();
                 }
-                insertionSet.insertInst(instIndex, WTFMove(shuffleArguments));
+                
+                if (UNLIKELY(hasRegisterSource))
+                    insertionSet.insertInst(instIndex, createShuffle(inst.origin, Vector<ShufflePair>(shufflePairs)));
+                else {
+                    // If none of the inputs are registers, then we can efficiently lower this
+                    // shuffle before register allocation. First we lower all of the moves to
+                    // memory, in the hopes that this is the last use of the operands. This
+                    // avoids creating interference between argument registers and arguments
+                    // that don't go into argument registers.
+                    for (ShufflePair& pair : shufflePairs) {
+                        if (pair.dst().isMemory())
+                            insertionSet.insertInst(instIndex, pair.inst(&code, inst.origin));
+                    }
+                    
+                    // Fill the argument registers by starting with the first one. This avoids
+                    // creating interference between things passed to low-numbered argument
+                    // registers and high-numbered argument registers. The assumption here is
+                    // that lower-numbered argument registers are more likely to be
+                    // incidentally clobbered.
+                    for (ShufflePair& pair : shufflePairs) {
+                        if (!pair.dst().isMemory())
+                            insertionSet.insertInst(instIndex, pair.inst(nullptr, inst.origin));
+                    }
+                }
 
                 // Indicate that we're using our original callee argument.
                 destinations[0] = inst.args[0];
@@ -91,8 +115,17 @@
                     insertionSet.insert(instIndex + 1, Move, value, result, resultDst);
                     break;
                 }
+            };
+
+            switch (inst.kind.opcode) {
+            case ColdCCall:
+                if (code.optLevel() < 2)
+                    handleCall();
                 break;
-            }
+                
+            case CCall:
+                handleCall();
+                break;
 
             default:
                 break;

Modified: trunk/Source/JavaScriptCore/b3/testb3.cpp (214900 => 214901)


--- trunk/Source/JavaScriptCore/b3/testb3.cpp	2017-04-04 21:32:15 UTC (rev 214900)
+++ trunk/Source/JavaScriptCore/b3/testb3.cpp	2017-04-04 21:48:41 UTC (rev 214901)
@@ -119,7 +119,8 @@
 
 std::unique_ptr<Compilation> compileProc(Procedure& procedure, unsigned optLevel = defaultOptLevel())
 {
-    return std::make_unique<Compilation>(B3::compile(procedure, optLevel));
+    procedure.setOptLevel(optLevel);
+    return std::make_unique<Compilation>(B3::compile(procedure));
 }
 
 template<typename T, typename... Arguments>

Modified: trunk/Source/JavaScriptCore/wasm/WasmB3IRGenerator.cpp (214900 => 214901)


--- trunk/Source/JavaScriptCore/wasm/WasmB3IRGenerator.cpp	2017-04-04 21:32:15 UTC (rev 214900)
+++ trunk/Source/JavaScriptCore/wasm/WasmB3IRGenerator.cpp	2017-04-04 21:48:41 UTC (rev 214901)
@@ -1288,6 +1288,8 @@
     // don't strictly need to run Air::reportUsedRegisters(), which saves a bit of CPU time at
     // optLevel=1.
     procedure.setNeedsUsedRegisters(false);
+    
+    procedure.setOptLevel(optLevel);
 
     B3IRGenerator context(info, procedure, result.get(), unlinkedWasmToWasmCalls, mode);
     FunctionParser<B3IRGenerator> parser(context, functionStart, functionLength, signature, info, moduleSignatureIndicesToUniquedSignatureIndices);
@@ -1304,7 +1306,7 @@
     dataLogIf(verbose, "Post SSA: ", procedure);
     
     {
-        B3::prepareForGeneration(procedure, optLevel);
+        B3::prepareForGeneration(procedure);
         B3::generate(procedure, *compilationContext.wasmEntrypointJIT);
         compilationContext.wasmEntrypointByproducts = procedure.releaseByproducts();
         result->wasmEntrypoint.calleeSaveRegisters = procedure.calleeSaveRegisters();