Title: [206682] trunk/Source
Revision
206682
Author
[email protected]
Date
2016-09-30 15:29:24 -0700 (Fri, 30 Sep 2016)

Log Message

B3::moveConstants should be able to edit code to minimize the number of constants
https://bugs.webkit.org/show_bug.cgi?id=162764

Reviewed by Saam Barati.
        
Source/_javascript_Core:

There are some interesting cases where we can reduce the number of constant materializations if
we teach moveConstants() how to edit code. The two examples that this patch supports are:
        
    - Loads and stores from a constant pointer. Since loads and stores get an offset for free
      and the instruction selector is really good at handling it, and since we can query Air to
      see what kinds of offsets are legal, we can sometimes avoid using a constant pointer that
      is specific to the absolute address of that load and instead pick some other constant
      that is within offset distance of ours.
            
    - Add and Sub by a constant (x + c, x - c). Since x + c = x - -c and x - c = x + -c, we can
      flip Add to Sub or vice versa if the negated constant is available.
        
This change makes moveConstants() pick the most dominant constant that works for an value. In
the case of memory accesses, it uses Air::Arg::isValidAddrForm() to work out what other
constants would work. In the case of Add/Sub, it simply looks for the negated constant. This
should result in something like a minimal number of constants since these rules always pick the
most dominant constant that works - so if an Add's constant is already most dominant then
nothing changes, but if the negated one is more dominant then it becomes a Sub.
        
This is a 0.5% speed-up on LongSpider and neutral elsewhere. It's a speed-up because the
absolute address thing reduces the number of address materializations that we have to do, while
the add/sub thing prevents us from having to materialize 0x1000000000000 to box doubles.
However, this may introduce a pathology, which I've filed a bug for: bug 162796.

* b3/B3MoveConstants.cpp:
* b3/B3MoveConstants.h:
* b3/B3UseCounts.h:
* b3/air/AirFixObviousSpills.cpp:
* b3/testb3.cpp:
(JSC::B3::testMoveConstants):
(JSC::B3::run):

Source/WTF:

I thought it would be a good idea to document the fact that dominator traversal happens in a
particular order for a reason.

* wtf/Dominators.h:

Modified Paths

Diff

Modified: trunk/Source/_javascript_Core/ChangeLog (206681 => 206682)


--- trunk/Source/_javascript_Core/ChangeLog	2016-09-30 22:21:59 UTC (rev 206681)
+++ trunk/Source/_javascript_Core/ChangeLog	2016-09-30 22:29:24 UTC (rev 206682)
@@ -1,3 +1,42 @@
+2016-09-30  Filip Pizlo  <[email protected]>
+
+        B3::moveConstants should be able to edit code to minimize the number of constants
+        https://bugs.webkit.org/show_bug.cgi?id=162764
+
+        Reviewed by Saam Barati.
+        
+        There are some interesting cases where we can reduce the number of constant materializations if
+        we teach moveConstants() how to edit code. The two examples that this patch supports are:
+        
+            - Loads and stores from a constant pointer. Since loads and stores get an offset for free
+              and the instruction selector is really good at handling it, and since we can query Air to
+              see what kinds of offsets are legal, we can sometimes avoid using a constant pointer that
+              is specific to the absolute address of that load and instead pick some other constant
+              that is within offset distance of ours.
+            
+            - Add and Sub by a constant (x + c, x - c). Since x + c = x - -c and x - c = x + -c, we can
+              flip Add to Sub or vice versa if the negated constant is available.
+        
+        This change makes moveConstants() pick the most dominant constant that works for an value. In
+        the case of memory accesses, it uses Air::Arg::isValidAddrForm() to work out what other
+        constants would work. In the case of Add/Sub, it simply looks for the negated constant. This
+        should result in something like a minimal number of constants since these rules always pick the
+        most dominant constant that works - so if an Add's constant is already most dominant then
+        nothing changes, but if the negated one is more dominant then it becomes a Sub.
+        
+        This is a 0.5% speed-up on LongSpider and neutral elsewhere. It's a speed-up because the
+        absolute address thing reduces the number of address materializations that we have to do, while
+        the add/sub thing prevents us from having to materialize 0x1000000000000 to box doubles.
+        However, this may introduce a pathology, which I've filed a bug for: bug 162796.
+
+        * b3/B3MoveConstants.cpp:
+        * b3/B3MoveConstants.h:
+        * b3/B3UseCounts.h:
+        * b3/air/AirFixObviousSpills.cpp:
+        * b3/testb3.cpp:
+        (JSC::B3::testMoveConstants):
+        (JSC::B3::run):
+
 2016-09-30  Joseph Pecoraro  <[email protected]>
 
         Fix modules tests after r206653 handle breakpoint locations in import/export statements

Modified: trunk/Source/_javascript_Core/b3/B3MoveConstants.cpp (206681 => 206682)


--- trunk/Source/_javascript_Core/b3/B3MoveConstants.cpp	2016-09-30 22:21:59 UTC (rev 206681)
+++ trunk/Source/_javascript_Core/b3/B3MoveConstants.cpp	2016-09-30 22:29:24 UTC (rev 206682)
@@ -134,10 +134,31 @@
         for (BasicBlock* block : m_proc) {
             for (unsigned valueIndex = 0; valueIndex < block->size(); ++valueIndex) {
                 Value* value = block->at(valueIndex);
-                for (Value* child : value->children()) {
+                
+                // This finds the outermost (best) block last. So, the functor overrides the result
+                // each time it finds something acceptable.
+                auto findBestConstant = [&] (const auto& predicate) -> Value* {
+                    Value* result = nullptr;
+                    dominators.forAllDominatorsOf(
+                        block,
+                        [&] (BasicBlock* dominator) {
+                            for (Value* value : materializations[dominator]) {
+                                if (predicate(value)) {
+                                    result = value;
+                                    break;
+                                }
+                            }
+                        });
+                    return result;
+                };
+                
+                // We call this when we have found a constant that we'd like to use. It's possible that
+                // we have computed that the constant should be meterialized in this block, but we
+                // haven't inserted it yet. This inserts the constant if necessary.
+                auto materialize = [&] (Value* child) {
                     ValueKey key = child->key();
                     if (!filter(key))
-                        continue;
+                        return;
 
                     // If we encounter a fast constant, then it must be canonical, since we already
                     // got rid of the non-canonical ones.
@@ -146,7 +167,7 @@
                     if (child->owner != block) {
                         // This constant isn't our problem. It's going to be materialized in another
                         // block.
-                        continue;
+                        return;
                     }
                     
                     // We're supposed to materialize this constant in this block, and we haven't
@@ -153,7 +174,81 @@
                     // done it yet.
                     m_insertionSet.insertValue(valueIndex, child);
                     child->owner = nullptr;
+                };
+                
+                if (MemoryValue* memoryValue = value->as<MemoryValue>()) {
+                    Value* pointer = memoryValue->lastChild();
+                    if (pointer->hasIntPtr() && filter(pointer->key())) {
+                        auto desiredOffset = [&] (Value* otherPointer) -> intptr_t {
+                            // We would turn this:
+                            //
+                            //     Load(@p, offset = c)
+                            //
+                            // into this:
+                            //
+                            //     Load(@q, offset = ?)
+                            //
+                            // The offset should be c + @p - @q, because then we're loading from:
+                            //
+                            //     @q + c + @p - @q
+                            uintptr_t c = static_cast<uintptr_t>(static_cast<intptr_t>(memoryValue->offset()));
+                            uintptr_t p = pointer->asIntPtr();
+                            uintptr_t q = otherPointer->asIntPtr();
+                            return c + p - q;
+                        };
+                        
+                        Value* bestPointer = findBestConstant(
+                            [&] (Value* candidatePointer) -> bool {
+                                if (!candidatePointer->hasIntPtr())
+                                    return false;
+                                
+                                intptr_t offset = desiredOffset(candidatePointer);
+                                if (!B3::isRepresentableAs<int32_t>(static_cast<int64_t>(offset)))
+                                    return false;
+                                return Air::Arg::isValidAddrForm(
+                                    static_cast<int32_t>(offset),
+                                    Air::Arg::widthForBytes(memoryValue->accessByteSize()));
+                            });
+                        
+                        if (bestPointer) {
+                            memoryValue->lastChild() = bestPointer;
+                            memoryValue->setOffset(desiredOffset(bestPointer));
+                        }
+                    }
+                } else {
+                    switch (value->opcode()) {
+                    case Add:
+                    case Sub: {
+                        Value* addend = value->child(1);
+                        if (!addend->hasInt() || !filter(addend->key()))
+                            break;
+                        int64_t addendConst = addend->asInt();
+                        Value* bestAddend = findBestConstant(
+                            [&] (Value* candidateAddend) -> bool {
+                                if (candidateAddend->type() != addend->type())
+                                    return false;
+                                if (!candidateAddend->hasInt())
+                                    return false;
+                                return candidateAddend == addend
+                                    || candidateAddend->asInt() == -addendConst;
+                            });
+                        if (!bestAddend || bestAddend == addend)
+                            break;
+                        materialize(value->child(0));
+                        materialize(bestAddend);
+                        value->replaceWithIdentity(
+                            m_insertionSet.insert<Value>(
+                                valueIndex, value->opcode() == Add ? Sub : Add, value->origin(),
+                                value->child(0), bestAddend));
+                        break;
+                    }
+                    default:
+                        break;
+                    }
                 }
+                
+                for (Value* child : value->children())
+                    materialize(child);
             }
 
             // We may have some constants that need to be materialized right at the end of this

Modified: trunk/Source/_javascript_Core/b3/B3MoveConstants.h (206681 => 206682)


--- trunk/Source/_javascript_Core/b3/B3MoveConstants.h	2016-09-30 22:21:59 UTC (rev 206681)
+++ trunk/Source/_javascript_Core/b3/B3MoveConstants.h	2016-09-30 22:29:24 UTC (rev 206682)
@@ -1,5 +1,5 @@
 /*
- * Copyright (C) 2015 Apple Inc. All rights reserved.
+ * Copyright (C) 2015-2016 Apple Inc. All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
@@ -33,7 +33,7 @@
 
 // Moves large constants around, with the goal of placing them in the optimal points in the program.
 
-void moveConstants(Procedure&);
+JS_EXPORT_PRIVATE void moveConstants(Procedure&);
 
 } } // namespace JSC::B3
 

Modified: trunk/Source/_javascript_Core/b3/B3UseCounts.h (206681 => 206682)


--- trunk/Source/_javascript_Core/b3/B3UseCounts.h	2016-09-30 22:21:59 UTC (rev 206681)
+++ trunk/Source/_javascript_Core/b3/B3UseCounts.h	2016-09-30 22:29:24 UTC (rev 206682)
@@ -1,5 +1,5 @@
 /*
- * Copyright (C) 2015 Apple Inc. All rights reserved.
+ * Copyright (C) 2015-2016 Apple Inc. All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
@@ -36,8 +36,8 @@
 
 class UseCounts {
 public:
-    UseCounts(Procedure&);
-    ~UseCounts();
+    JS_EXPORT_PRIVATE UseCounts(Procedure&);
+    JS_EXPORT_PRIVATE ~UseCounts();
 
     unsigned numUses(Value* value) const { return m_counts[value].numUses; }
     unsigned numUsingInstructions(Value* value) const { return m_counts[value].numUsingInstructions; }

Modified: trunk/Source/_javascript_Core/b3/testb3.cpp (206681 => 206682)


--- trunk/Source/_javascript_Core/b3/testb3.cpp	2016-09-30 22:21:59 UTC (rev 206681)
+++ trunk/Source/_javascript_Core/b3/testb3.cpp	2016-09-30 22:29:24 UTC (rev 206682)
@@ -42,6 +42,7 @@
 #include "B3LowerToAir.h"
 #include "B3MathExtras.h"
 #include "B3MemoryValue.h"
+#include "B3MoveConstants.h"
 #include "B3Procedure.h"
 #include "B3ReduceStrength.h"
 #include "B3SlotBaseValue.h"
@@ -49,6 +50,7 @@
 #include "B3StackmapGenerationParams.h"
 #include "B3SwitchValue.h"
 #include "B3UpsilonValue.h"
+#include "B3UseCounts.h"
 #include "B3Validate.h"
 #include "B3ValueInlines.h"
 #include "B3VariableValue.h"
@@ -13106,6 +13108,67 @@
     checkDoesNotUseInstruction(*code, "dmb    ishst");
 }
 
+void testMoveConstants()
+{
+    auto check = [] (Procedure& proc) {
+        proc.resetReachability();
+        
+        if (shouldBeVerbose()) {
+            dataLog("IR before:\n");
+            dataLog(proc);
+        }
+        
+        moveConstants(proc);
+        
+        if (shouldBeVerbose()) {
+            dataLog("IR after:\n");
+            dataLog(proc);
+        }
+        
+        UseCounts useCounts(proc);
+        unsigned count = 0;
+        for (Value* value : proc.values()) {
+            if (useCounts.numUses(value) && value->hasInt64())
+                count++;
+        }
+        
+        if (count == 1)
+            return;
+        
+        crashLock.lock();
+        dataLog("Fail in testMoveConstants: got more than one Const64:\n");
+        dataLog(proc);
+        CRASH();
+    };
+
+    {
+        Procedure proc;
+        BasicBlock* root = proc.addBlock();
+        Value* a = root->appendNew<MemoryValue>(
+            proc, Load, Int32, Origin(), 
+            root->appendNew<ConstPtrValue>(proc, Origin(), 0x123412341234));
+        Value* b = root->appendNew<MemoryValue>(
+            proc, Load, Int32, Origin(),
+            root->appendNew<ConstPtrValue>(proc, Origin(), 0x123412341334));
+        root->appendNew<CCallValue>(proc, Void, Origin(), a, b);
+        root->appendNew<Value>(proc, Return, Origin());
+        check(proc);
+    }
+    
+    {
+        Procedure proc;
+        BasicBlock* root = proc.addBlock();
+        Value* x = root->appendNew<ArgumentRegValue>(proc, Origin(), GPRInfo::argumentGPR0);
+        Value* a = root->appendNew<Value>(
+            proc, Add, Origin(), x, root->appendNew<ConstPtrValue>(proc, Origin(), 0x123412341234));
+        Value* b = root->appendNew<Value>(
+            proc, Add, Origin(), x, root->appendNew<ConstPtrValue>(proc, Origin(), -0x123412341234));
+        root->appendNew<CCallValue>(proc, Void, Origin(), a, b);
+        root->appendNew<Value>(proc, Return, Origin());
+        check(proc);
+    }
+}
+
 // Make sure the compiler does not try to optimize anything out.
 NEVER_INLINE double zero()
 {
@@ -14546,6 +14609,7 @@
     RUN(testMemoryFence());
     RUN(testStoreFence());
     RUN(testLoadFence());
+    RUN(testMoveConstants());
     
     if (tasks.isEmpty())
         usage();

Modified: trunk/Source/WTF/ChangeLog (206681 => 206682)


--- trunk/Source/WTF/ChangeLog	2016-09-30 22:21:59 UTC (rev 206681)
+++ trunk/Source/WTF/ChangeLog	2016-09-30 22:29:24 UTC (rev 206682)
@@ -1,3 +1,15 @@
+2016-09-30  Filip Pizlo  <[email protected]>
+
+        B3::moveConstants should be able to edit code to minimize the number of constants
+        https://bugs.webkit.org/show_bug.cgi?id=162764
+
+        Reviewed by Saam Barati.
+        
+        I thought it would be a good idea to document the fact that dominator traversal happens in a
+        particular order for a reason.
+
+        * wtf/Dominators.h:
+
 2016-09-29  Filip Pizlo  <[email protected]>
 
         Air should have a way of expressing additional instruction flags

Modified: trunk/Source/WTF/wtf/Dominators.h (206681 => 206682)


--- trunk/Source/WTF/wtf/Dominators.h	2016-09-30 22:21:59 UTC (rev 206681)
+++ trunk/Source/WTF/wtf/Dominators.h	2016-09-30 22:29:24 UTC (rev 206682)
@@ -136,6 +136,9 @@
             functor(block);
     }
     
+    // Note: This will visit the dominators starting with the 'to' node and moving up the idom tree
+    // until it gets to the root. Some clients of this function, like B3::moveConstants(), rely on this
+    // order.
     template<typename Functor>
     void forAllDominatorsOf(typename Graph::Node to, const Functor& functor) const
     {
_______________________________________________
webkit-changes mailing list
[email protected]
https://lists.webkit.org/mailman/listinfo/webkit-changes

Reply via email to