Title: [279193] trunk/Source
Revision: 279193
Author: [email protected]
Date: 2021-06-23 15:09:31 -0700 (Wed, 23 Jun 2021)

Log Message

Add a new pattern to instruction selector to utilize UBFIZ supported by ARM64
https://bugs.webkit.org/show_bug.cgi?id=227204

Patch by Yijia Huang <[email protected]> on 2021-06-23
Reviewed by Filip Pizlo.

Source/JavaScriptCore:

This patch includes three parts:
    A) Add UBFIZ to instruction selector.
    B) Fix UBFX, introduced in https://bugs.webkit.org/show_bug.cgi?id=226984,
       to match all patterns.
    C) Fix an incorrect condition in the strength reduction introduced
       in https://bugs.webkit.org/show_bug.cgi?id=227138.

Part A
Unsigned Bitfield Insert in Zeros (UBFIZ), supported by ARM64, zeros the
destination register and copies any number of contiguous bits from a
source register into any position in the destination register. The
instruction selector can utilize this to lower certain patterns in
B3 IR before further Air optimization.

Given the operation: ubfiz d, n, lsb, width

This is equivalent to "d = (n << lsb) & (((1 << width) - 1) << lsb)".
Since wasm introduces constant folding, which can pre-compute maskShift = mask << lsb, the matched patterns are:
1.1 d = (n << lsb) & maskShift
1.2 d = maskShift & (n << lsb)

2.1 d = (n & mask) << lsb
2.2 d = (mask & n) << lsb

Where:
    maskShift = mask << lsb
    mask = (1 << width) - 1

To make the pattern matching in instruction selection beneficial to the JIT,
the following constraints must hold:
    1. 0 <= lsb < datasize
    2. 0 < width < datasize
    3. lsb + width <= datasize
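
To make the equivalence concrete, here is a minimal self-checking C++
sketch (ubfizModel is an illustrative name, not a WebKit API) that models
the 64-bit operation; under the constraints above, all four patterns
agree:

    #include <cassert>
    #include <cstdint>

    // Software model of "ubfiz d, n, lsb, width": zero the destination,
    // then copy the low `width` bits of n into position `lsb`.
    uint64_t ubfizModel(uint64_t n, unsigned lsb, unsigned width)
    {
        uint64_t mask = (uint64_t(1) << width) - 1; // width < 64 per constraint 2
        return (n & mask) << lsb;
    }

    int main()
    {
        uint64_t n = 0x123456789abcdef0;
        unsigned lsb = 8, width = 16;
        uint64_t mask = (uint64_t(1) << width) - 1;
        uint64_t maskShift = mask << lsb;

        uint64_t d = ubfizModel(n, lsb, width);
        assert(d == ((n << lsb) & maskShift)); // pattern 1.1
        assert(d == (maskShift & (n << lsb))); // pattern 1.2
        assert(d == ((n & mask) << lsb));      // pattern 2.1
        assert(d == ((mask & n) << lsb));      // pattern 2.2
        return 0;
    }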

Choose (n & mask) << lsb as the canonical form and introduce a strength reduction.
Turn this: (n << lsb) & maskShift
Into this: (n & mask) << lsb

Given B3 IR:
Int @0 = ArgumentReg(%x0)
Int @1 = 1
Int @2 = 0b0110
Int @3 = Shl(@0, @1)
Int @4 = BitAnd(@3, @2)
Void @5 = Return(@4, Terminal)

Here @2 is maskShift = 0b0110 = 0b11 << 1, so lsb = 1 and width = 2.

Before Adding UBFIZ Pattern:
// Old optimized AIR
Lshift  %x0, $1, %x0,  @3
And  0b0110, %x0, %x0, @4
Ret     %x0,           @5

After Adding UBFIZ Pattern:
// New optimized AIR
Ubfiz %x0, 1, 2, %x0, @4
Ret   %x0,            @5

Part B
Fix UBFX to match both patterns:
dest = (src >> lsb) & mask
dest = mask & (src >> lsb)

Where:
1. mask = (1 << width) - 1
2. 0 <= lsb < datasize
3. 0 < width < datasize
4. lsb + width <= datasize
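
A minimal sketch of the extraction being matched (ubfxModel is an
illustrative name, not a WebKit API); both commutations of the BitAnd
compute the same bitfield extract:

    #include <cassert>
    #include <cstdint>

    // Software model of "ubfx d, src, lsb, width": extract `width` bits
    // of src starting at bit `lsb`, zero-extended into the destination.
    uint64_t ubfxModel(uint64_t src, unsigned lsb, unsigned width)
    {
        uint64_t mask = (uint64_t(1) << width) - 1; // width < 64 per constraint 3
        return (src >> lsb) & mask;
    }

    int main()
    {
        uint64_t src = 0xfedcba9876543210;
        uint64_t mask = (uint64_t(1) << 16) - 1;
        assert(ubfxModel(src, 8, 16) == ((src >> 8) & mask));
        assert(ubfxModel(src, 8, 16) == (mask & (src >> 8)));
        return 0;
    }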

Part C
Fix one B3 strength reduction.
Turn this: (src >> shiftAmount) & mask
Into this: src >> shiftAmount

With updated constraints:
1. mask = (1 << width) - 1
2. 0 <= shiftAmount < datasize
3. 0 < width < datasize
4. shiftAmount + width >= datasize
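
Constraint 4 is what makes the mask redundant: a logical right shift by
shiftAmount leaves at most datasize - shiftAmount significant bits, and a
contiguous low mask of width >= datasize - shiftAmount keeps all of them.
A small self-checking sketch of one such case:

    #include <cassert>
    #include <cstdint>

    int main()
    {
        // datasize = 64, shiftAmount = 60, width = 8:
        // shiftAmount + width = 68 >= 64, so the BitAnd keeps every bit
        // the ZShr can leave (only 64 - 60 = 4 significant bits remain).
        uint64_t src = 0xdeadbeefcafebabe;
        unsigned shiftAmount = 60;
        uint64_t mask = (uint64_t(1) << 8) - 1; // width = 8

        assert(((src >> shiftAmount) & mask) == (src >> shiftAmount));

        // The previous condition required shiftAmount + width == datasize
        // exactly and missed this case; the fix accepts >=.
        return 0;
    }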

* assembler/MacroAssemblerARM64.h:
(JSC::MacroAssemblerARM64::ubfiz32):
(JSC::MacroAssemblerARM64::ubfiz64):
* assembler/testmasm.cpp:
(JSC::testUbfiz32):
(JSC::testUbfiz64):
* b3/B3LowerToAir.cpp:
* b3/B3ReduceStrength.cpp:
* b3/air/AirOpcode.opcodes:
* b3/testb3.h:
* b3/testb3_2.cpp:
(testUbfx32ShiftAnd):
(testUbfx32AndShift):
(testUbfx64ShiftAnd):
(testUbfx64AndShift):
(testUbfiz32AndShiftValueMask):
(testUbfiz32AndShiftMaskValue):
(testUbfiz32ShiftAnd):
(testUbfiz32AndShift):
(testUbfiz64AndShiftValueMask):
(testUbfiz64AndShiftMaskValue):
(testUbfiz64ShiftAnd):
(testUbfiz64AndShift):
(addBitTests):
(testUbfx32): Deleted.
(testUbfx32PatternMatch): Deleted.
(testUbfx64): Deleted.
(testUbfx64PatternMatch): Deleted.

Source/WTF:

Add functions that count the consecutive trailing zero bits on the
right using modulus division and a lookup table. Reference: Bit Twiddling Hacks.
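
These isolate the lowest set bit with (1 + ~v) & v (two's-complement
-v & v) and exploit the fact that powers of two have distinct residues
modulo 37 (modulo 67 for the 64-bit variant), so a small table maps the
residue back to the bit index. A standalone sketch of the 32-bit trick,
which B3ReduceStrength uses to recover lsb from maskShift:

    #include <cassert>
    #include <cstddef>
    #include <cstdint>

    // Mod-37 trick from Bit Twiddling Hacks: -v & v isolates the lowest
    // set bit, and each power of two has a unique residue mod 37.
    size_t countTrailingZeros32(uint32_t v)
    {
        static const unsigned mod37BitPosition[] = {
            32, 0, 1, 26, 2, 23, 27, 0, 3, 16, 24, 30, 28, 11, 0, 13,
            4, 7, 17, 0, 25, 22, 31, 15, 29, 10, 12, 6, 0, 21, 14, 9,
            5, 20, 8, 19, 18
        };
        return mod37BitPosition[((1 + ~v) & v) % 37];
    }

    int main()
    {
        assert(countTrailingZeros32(0) == 32);     // no set bits
        assert(countTrailingZeros32(0b0110) == 1); // lsb of a maskShift
        for (unsigned i = 0; i < 32; ++i)
            assert(countTrailingZeros32(uint32_t(1) << i) == i);
        return 0;
    }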

* wtf/MathExtras.h:
(WTF::countTrailingZeros):

Diff

Modified: trunk/Source/JavaScriptCore/assembler/MacroAssemblerARM64.h (279192 => 279193)


--- trunk/Source/JavaScriptCore/assembler/MacroAssemblerARM64.h	2021-06-23 22:04:40 UTC (rev 279192)
+++ trunk/Source/JavaScriptCore/assembler/MacroAssemblerARM64.h	2021-06-23 22:09:31 UTC (rev 279193)
@@ -400,6 +400,16 @@
         m_assembler.ubfx<64>(dest, src, lsb.m_value, width.m_value);
     }
 
+    void ubfiz32(RegisterID src, TrustedImm32 lsb, TrustedImm32 width, RegisterID dest)
+    {
+        m_assembler.ubfiz<32>(dest, src, lsb.m_value, width.m_value);
+    }
+
+    void ubfiz64(RegisterID src, TrustedImm32 lsb, TrustedImm32 width, RegisterID dest)
+    {
+        m_assembler.ubfiz<64>(dest, src, lsb.m_value, width.m_value);
+    }
+
     void and64(RegisterID src1, RegisterID src2, RegisterID dest)
     {
         m_assembler.and_<64>(dest, src1, src2);

Modified: trunk/Source/JavaScriptCore/assembler/testmasm.cpp (279192 => 279193)


--- trunk/Source/JavaScriptCore/assembler/testmasm.cpp	2021-06-23 22:04:40 UTC (rev 279192)
+++ trunk/Source/JavaScriptCore/assembler/testmasm.cpp	2021-06-23 22:09:31 UTC (rev 279193)
@@ -1159,6 +1159,56 @@
         }
     }
 }
+
+void testUbfiz32()
+{
+    uint32_t src = 0xffffffff;
+    Vector<uint32_t> imms = { 0, 1, 30, 31, 32, 62, 63, 64 };
+    for (auto lsb : imms) {
+        for (auto width : imms) {
+            if (lsb >= 0 && width > 0 && lsb + width < 32) {
+                auto ubfiz32 = compile([=] (CCallHelpers& jit) {
+                    emitFunctionPrologue(jit);
+
+                    jit.ubfiz32(GPRInfo::returnValueGPR, 
+                        CCallHelpers::TrustedImm32(lsb), 
+                        CCallHelpers::TrustedImm32(width), 
+                        GPRInfo::returnValueGPR);
+
+                    emitFunctionEpilogue(jit);
+                    jit.ret();
+                });
+                uint32_t mask = (1U << width) - 1U;
+                CHECK_EQ(invoke<uint32_t>(ubfiz32, src), (src & mask) << lsb);
+            }
+        }
+    }
+}
+
+void testUbfiz64()
+{
+    uint64_t src = 0xffffffffffffffff;
+    Vector<uint32_t> imms = { 0, 1, 30, 31, 32, 62, 63, 64 };
+    for (auto lsb : imms) {
+        for (auto width : imms) {
+            if (lsb >= 0 && width > 0 && lsb + width < 64) {
+                auto ubfiz64 = compile([=] (CCallHelpers& jit) {
+                    emitFunctionPrologue(jit);
+
+                    jit.ubfiz64(GPRInfo::returnValueGPR, 
+                        CCallHelpers::TrustedImm32(lsb), 
+                        CCallHelpers::TrustedImm32(width), 
+                        GPRInfo::returnValueGPR);
+
+                    emitFunctionEpilogue(jit);
+                    jit.ret();
+                });
+                uint64_t mask = (1ULL << width) - 1ULL;
+                CHECK_EQ(invoke<uint64_t>(ubfiz64, src), (src & mask) << lsb);
+            }
+        }
+    }
+}
 #endif
 
 #if CPU(X86) || CPU(X86_64) || CPU(ARM64)
@@ -3328,6 +3378,8 @@
     RUN(testMulSubSignExtend32());
     RUN(testUbfx32());
     RUN(testUbfx64());
+    RUN(testUbfiz32());
+    RUN(testUbfiz64());
 #endif
 
 #if CPU(X86) || CPU(X86_64) || CPU(ARM64)

Modified: trunk/Source/JavaScriptCore/b3/B3LowerToAir.cpp (279192 => 279193)


--- trunk/Source/JavaScriptCore/b3/B3LowerToAir.cpp	2021-06-23 22:04:40 UTC (rev 279192)
+++ trunk/Source/JavaScriptCore/b3/B3LowerToAir.cpp	2021-06-23 22:09:31 UTC (rev 279193)
@@ -2617,7 +2617,7 @@
                     Value* multiplyLeft = right->child(0);
                     Value* multiplyRight = right->child(1);
 
-                    // SMSUBL: d = a - SExt32(n) *  SExt32(m)
+                    // SMSUBL: d = a - SExt32(n) * SExt32(m)
                     if (multiplySubOpcode == MultiplySub64
                         && isValidForm(MultiplySubSignExtend32, Arg::Tmp, Arg::Tmp, Arg::Tmp, Arg::Tmp)) {
                         auto trySExt32 = [&] (Value* v) {
@@ -2761,27 +2761,36 @@
                 return;
             }
 
-            // UBFX Pattern: dest = (src >> lsb) & ((1 << width) - 1)
-            if (canBeInternal(left) && left->opcode() == ZShr) {
+            // UBFX Pattern: dest = (src >> lsb) & mask where mask = (1 << width) - 1
+            auto tryAppendUBFX = [&] () -> bool {
+                if (left->opcode() != ZShr || !canBeInternal(left))
+                    return false;
                 Value* srcValue = left->child(0);
                 Value* lsbValue = left->child(1);
-                if (!imm(srcValue) && imm(lsbValue) && right->hasInt()) {
-                    int64_t lsb = lsbValue->asInt();
-                    uint64_t mask = right->asInt();
-                    uint8_t width = static_cast<uint8_t>(!(mask & (mask + 1))) * WTF::bitCount(mask);
-                    Air::Opcode opcode = opcodeForType(Ubfx32, Ubfx64, srcValue->type());
-                    if (opcode
-                        && lsb >= 0
-                        && width > 0
-                        && lsb + width <= (32 << (opcode == Ubfx64))
-                        && isValidForm(opcode, Arg::Tmp, Arg::Imm, Arg::Imm, Arg::Tmp))  {
-                        append(opcode, tmp(srcValue), imm(lsbValue), imm(width), tmp(m_value));
-                        commitInternal(left);
-                        return;
-                    }
-                }
-            }
 
+                Air::Opcode opcode = opcodeForType(Ubfx32, Ubfx64, srcValue->type());
+                if (!isValidForm(opcode, Arg::Tmp, Arg::Imm, Arg::Imm, Arg::Tmp)) 
+                    return false;
+                if (!imm(lsbValue) || lsbValue->asInt() < 0 || !right->hasInt())
+                    return false;
+
+                uint64_t lsb = lsbValue->asInt();
+                uint64_t mask = right->asInt();
+                if (!mask || mask & (mask + 1))
+                    return false;
+                uint64_t width = WTF::bitCount(mask);
+                uint64_t datasize = opcode == Ubfx32 ? 32 : 64;
+                if (lsb + width > datasize)
+                    return false;
+
+                append(opcode, tmp(srcValue), imm(lsbValue), imm(width), tmp(m_value));
+                commitInternal(left);
+                return true;
+            };
+
+            if (tryAppendUBFX())
+                return;
+
             appendBinOp<And32, And64, AndDouble, AndFloat, Commutative>(left, right);
             return;
         }
@@ -2824,12 +2833,45 @@
         }
 
         case Shl: {
-            if (m_value->child(1)->isInt32(1)) {
-                appendBinOp<Add32, Add64, AddDouble, AddFloat, Commutative>(m_value->child(0), m_value->child(0));
+            Value* left = m_value->child(0);
+            Value* right = m_value->child(1);
+
+            // UBFIZ Pattern: d = (n & mask) << lsb where mask = (1 << width) - 1
+            auto tryAppendUBFIZ = [&] () -> bool {
+                if (left->opcode() != BitAnd || !canBeInternal(left))
+                    return false;
+                Value* nValue = left->child(0);
+                Value* maskValue = left->child(1);
+
+                Air::Opcode opcode = opcodeForType(Ubfiz32, Ubfiz64, nValue->type());
+                if (!isValidForm(opcode, Arg::Tmp, Arg::Imm, Arg::Imm, Arg::Tmp)) 
+                    return false;
+                if (!maskValue->hasInt() || !imm(right) || right->asInt() < 0)
+                    return false;
+
+                uint64_t lsb = right->asInt();
+                uint64_t mask = maskValue->asInt();
+                if (!mask || mask & (mask + 1))
+                    return false;
+                uint64_t width = WTF::bitCount(mask);
+                uint64_t datasize = opcode == Ubfiz32 ? 32 : 64;
+                if (lsb + width > datasize)
+                    return false;
+
+                append(opcode, tmp(nValue), imm(right), imm(width), tmp(m_value));
+                commitInternal(left);
+                return true;
+            };
+
+            if (tryAppendUBFIZ())
                 return;
+
+            if (right->isInt32(1)) {
+                appendBinOp<Add32, Add64, AddDouble, AddFloat, Commutative>(left, left);
+                return;
             }
-            
-            appendShift<Lshift32, Lshift64>(m_value->child(0), m_value->child(1));
+
+            appendShift<Lshift32, Lshift64>(left, right);
             return;
         }
 

Modified: trunk/Source/JavaScriptCore/b3/B3ReduceStrength.cpp (279192 => 279193)


--- trunk/Source/JavaScriptCore/b3/B3ReduceStrength.cpp	2021-06-23 22:04:40 UTC (rev 279192)
+++ trunk/Source/JavaScriptCore/b3/B3ReduceStrength.cpp	2021-06-23 22:09:31 UTC (rev 279193)
@@ -42,6 +42,7 @@
 #include "B3ValueKeyInlines.h"
 #include "B3ValueInlines.h"
 #include <wtf/HashMap.h>
+#include <wtf/MathExtras.h>
 #include <wtf/StdLibExtras.h>
 
 namespace JSC { namespace B3 {
@@ -1037,22 +1038,59 @@
             }
 
             // Turn this: BitAnd(ZShr(value, shiftAmount), mask)
-            // - shiftAmount >= 0 and mask is contiguous ones from LSB, example 0b01111111
-            // - shiftAmount + bitCount(mask) == maxBitWidth
+            // Conditions:
+            // 1. mask = (1 << width) - 1
+            // 2. 0 <= shiftAmount < datasize
+            // 3. 0 < width < datasize
+            // 4. shiftAmount + width >= datasize
             // Into this: ZShr(value, shiftAmount)
             if (m_value->child(0)->opcode() == ZShr
                 && m_value->child(0)->child(1)->hasInt()
+                && m_value->child(0)->child(1)->asInt() >= 0
                 && m_value->child(1)->hasInt()) {
-                int64_t shiftAmount = m_value->child(0)->child(1)->asInt();
+                uint64_t shiftAmount = m_value->child(0)->child(1)->asInt();
                 uint64_t mask = m_value->child(1)->asInt();
-                bool isValid = mask && !(mask & (mask + 1));
-                uint64_t maxBitWidth = m_value->child(0)->child(0)->type() == Int64 ? 64 : 32;
-                if (shiftAmount >= 0 && isValid && static_cast<uint64_t>(shiftAmount + WTF::bitCount(mask)) == maxBitWidth) {
+                bool isValidMask = mask && !(mask & (mask + 1));
+                uint64_t datasize = m_value->child(0)->child(0)->type() == Int64 ? 64 : 32;
+                uint64_t width = WTF::bitCount(mask);
+                if (shiftAmount < datasize && isValidMask && shiftAmount + width >= datasize) {
                     replaceWithIdentity(m_value->child(0));
                     break;
                 }
             }
 
+            // Turn this: BitAnd(Shl(value, shiftAmount), maskShift)
+            // Conditions:
+            // 1. maskShift = mask << shiftAmount
+            // 2. mask = (1 << width) - 1
+            // 3. 0 <= shiftAmount < datasize
+            // 4. 0 < width < datasize
+            // 5. shiftAmount + width <= datasize
+            // Into this: Shl(BitAnd(value, mask), shiftAmount)
+            if (m_value->child(0)->opcode() == Shl
+                && m_value->child(0)->child(1)->hasInt()
+                && m_value->child(0)->child(1)->asInt() >= 0
+                && m_value->child(1)->hasInt()) {
+                uint64_t shiftAmount = m_value->child(0)->child(1)->asInt();
+                uint64_t maskShift = m_value->child(1)->asInt();
+                uint64_t maskShiftAmount = WTF::countTrailingZeros(maskShift);
+                uint64_t mask = maskShift >> maskShiftAmount;
+                uint64_t width = WTF::bitCount(mask);
+                uint64_t datasize = m_value->child(0)->child(0)->type() == Int64 ? 64 : 32;
+                bool isValidShiftAmount = shiftAmount == maskShiftAmount && shiftAmount < datasize;
+                bool isValidMask = mask && !(mask & (mask + 1)) && width < datasize;
+                if (isValidShiftAmount && isValidMask && shiftAmount + width <= datasize) {
+                    Value* maskValue;
+                    if (datasize == 32)
+                        maskValue = m_insertionSet.insert<Const32Value>(m_index, m_value->origin(), mask);
+                    else
+                        maskValue = m_insertionSet.insert<Const64Value>(m_index, m_value->origin(), mask);
+                    Value* bitAnd = m_insertionSet.insert<Value>(m_index, BitAnd, m_value->origin(), m_value->child(0)->child(0), maskValue);
+                    replaceWithNew<Value>(Shl, m_value->origin(), bitAnd, m_value->child(0)->child(1));
+                    break;
+                }
+            }
+
             // Turn this: BitAnd(value, all-ones)
             // Into this: value.
             if ((m_value->type() == Int64 && m_value->child(1)->isInt64(std::numeric_limits<uint64_t>::max()))

Modified: trunk/Source/JavaScriptCore/b3/air/AirOpcode.opcodes (279192 => 279193)


--- trunk/Source/JavaScriptCore/b3/air/AirOpcode.opcodes	2021-06-23 22:04:40 UTC (rev 279192)
+++ trunk/Source/JavaScriptCore/b3/air/AirOpcode.opcodes	2021-06-23 22:09:31 UTC (rev 279193)
@@ -369,6 +369,12 @@
 arm64: Ubfx64 U:G:64, U:G:32, U:G:32, D:G:64
     Tmp, Imm, Imm, Tmp
 
+arm64: Ubfiz32 U:G:32, U:G:32, U:G:32, ZD:G:32
+    Tmp, Imm, Imm, Tmp
+
+arm64: Ubfiz64 U:G:64, U:G:32, U:G:32, D:G:64
+    Tmp, Imm, Imm, Tmp
+
 64: And64 U:G:64, U:G:64, D:G:64
     Tmp, Tmp, Tmp
     arm64: BitImm64, Tmp, Tmp

Modified: trunk/Source/JavaScriptCore/b3/testb3.h (279192 => 279193)


--- trunk/Source/JavaScriptCore/b3/testb3.h	2021-06-23 22:04:40 UTC (rev 279192)
+++ trunk/Source/JavaScriptCore/b3/testb3.h	2021-06-23 22:09:31 UTC (rev 279193)
@@ -416,10 +416,18 @@
 
 void run(const char* filter);
 void testBitAndSExt32(int32_t value, int64_t mask);
-void testUbfx32();
-void testUbfx32PatternMatch();
-void testUbfx64();
-void testUbfx64PatternMatch();
+void testUbfx32ShiftAnd();
+void testUbfx32AndShift();
+void testUbfx64ShiftAnd();
+void testUbfx64AndShift();
+void testUbfiz32AndShiftValueMask();
+void testUbfiz32AndShiftMaskValue();
+void testUbfiz32ShiftAnd();
+void testUbfiz32AndShift();
+void testUbfiz64AndShiftValueMask();
+void testUbfiz64AndShiftMaskValue();
+void testUbfiz64ShiftAnd();
+void testUbfiz64AndShift();
 void testBitAndZeroShiftRightArgImmMask32();
 void testBitAndZeroShiftRightArgImmMask64();
 void testBasicSelect();

Modified: trunk/Source/JavaScriptCore/b3/testb3_2.cpp (279192 => 279193)


--- trunk/Source/JavaScriptCore/b3/testb3_2.cpp	2021-06-23 22:04:40 UTC (rev 279192)
+++ trunk/Source/JavaScriptCore/b3/testb3_2.cpp	2021-06-23 22:09:31 UTC (rev 279193)
@@ -2662,12 +2662,15 @@
     CHECK(isIdentical(compileAndRun<float>(proc, bitwise_cast<int32_t>(a)), -a));
 }
 
-void testUbfx32()
+void testUbfx32ShiftAnd()
 {
-    // (src >> lsb) & mask
+    // Test Pattern: (src >> lsb) & mask
+    // Where: mask = (1 << width) - 1
+    if (JSC::Options::defaultB3OptLevel() < 2)
+        return;
     uint32_t src = 0xffffffff;
-    Vector<uint32_t> lsbs = { 0, 15, 30 };
-    Vector<uint32_t> widths = { 30, 16, 1 };
+    Vector<uint32_t> lsbs = { 1, 14, 30 };
+    Vector<uint32_t> widths = { 30, 17, 1 };
 
     auto test = [&] (uint32_t lsb, uint32_t mask) -> uint32_t {
         Procedure proc;
@@ -2683,8 +2686,11 @@
         root->appendNewControlValue(
             proc, Return, Origin(),
             root->appendNew<Value>(proc, BitAnd, Origin(), left, maskValue));
-        
-        return compileAndRun<uint32_t>(proc, src);
+
+        auto code = compileProc(proc);
+        if (isARM64())
+            checkUsesInstruction(*code, "ubfx");
+        return invoke<uint32_t>(*code, src);
     };
 
     auto generateMask = [&] (uint32_t width) -> uint32_t {
@@ -2694,19 +2700,21 @@
     for (size_t i = 0; i < lsbs.size(); ++i) {
         uint32_t lsb = lsbs.at(i);
         uint32_t mask = generateMask(widths.at(i));
-        uint32_t lhs = test(lsb, mask);
-        uint32_t rhs = ((src >> lsb) & mask);
-        CHECK(lhs == rhs);
+        CHECK(test(lsb, mask) == ((src >> lsb) & mask));
     }
 }
 
-void testUbfx32PatternMatch()
+void testUbfx32AndShift()
 {
-    // (src >> lsb) & ((1 << width) - 1)
+    // Test Pattern: mask & (src >> lsb)
+    // Where: mask = (1 << width) - 1
+    if (JSC::Options::defaultB3OptLevel() < 2)
+        return;
     uint32_t src = 0xffffffff;
-    Vector<uint32_t> imms = { 0, 1, 30, 31, 32, 62, 63, 64 };
+    Vector<uint32_t> lsbs = { 1, 14, 30 };
+    Vector<uint32_t> widths = { 30, 17, 1 };
 
-    auto test = [&] (uint32_t lsb, uint32_t width) -> uint32_t {
+    auto test = [&] (uint32_t lsb, uint32_t mask) -> uint32_t {
         Procedure proc;
         BasicBlock* root = proc.addBlock();
 
@@ -2714,36 +2722,39 @@
             proc, Trunc, Origin(), 
             root->appendNew<ArgumentRegValue>(proc, Origin(), GPRInfo::argumentGPR0));
         Value* lsbValue = root->appendNew<Const32Value>(proc, Origin(), lsb);
-        Value* widthValue = root->appendNew<Const32Value>(proc, Origin(), width);
-        Value* constValueA = root->appendNew<Const32Value>(proc, Origin(), 1);
-        Value* constValueB = root->appendNew<Const32Value>(proc, Origin(), 1);
+        Value* maskValue = root->appendNew<Const32Value>(proc, Origin(), mask);
 
-        Value* left = root->appendNew<Value>(proc, ZShr, Origin(), srcValue, lsbValue);
-        Value* right = root->appendNew<Value>(
-            proc, Sub, Origin(), 
-            root->appendNew<Value>(proc, Shl, Origin(), constValueA, widthValue), constValueB);
+        Value* right = root->appendNew<Value>(proc, ZShr, Origin(), srcValue, lsbValue);
         root->appendNewControlValue(
             proc, Return, Origin(),
-            root->appendNew<Value>(proc, BitAnd, Origin(), left, right));
+            root->appendNew<Value>(proc, BitAnd, Origin(), maskValue, right));
 
-        return compileAndRun<uint32_t>(proc, src);
+        auto code = compileProc(proc);
+        if (isARM64())
+            checkUsesInstruction(*code, "ubfx");
+        return invoke<uint32_t>(*code, src);
     };
 
-    for (auto lsb : imms) {
-        for (auto width : imms) {
-            uint32_t lhs = test(lsb, width);
-            uint32_t rhs = ((src >> lsb) & ((1U << width) - 1U));
-            CHECK(lhs == rhs);
-        }
+    auto generateMask = [&] (uint32_t width) -> uint32_t {
+        return (1U << width) - 1U;
+    };
+
+    for (size_t i = 0; i < lsbs.size(); ++i) {
+        uint32_t lsb = lsbs.at(i);
+        uint32_t mask = generateMask(widths.at(i));
+        CHECK(test(lsb, mask) == (mask & (src >> lsb)));
     }
 }
 
-void testUbfx64()
+void testUbfx64ShiftAnd()
 {
-    // (src >> lsb) & mask
-    uint64_t src = 0xffffffffffffffff;
-    Vector<uint64_t> lsbs = { 0, 31, 62 };
-    Vector<uint64_t> widths = { 63, 32, 1 };
+    // Test Pattern: (src >> lsb) & mask
+    // Where: mask = (1 << width) - 1
+    if (JSC::Options::defaultB3OptLevel() < 2)
+        return;
+    uint64_t src = 0xffffffffffffffff;
+    Vector<uint64_t> lsbs = { 1, 30, 62 };
+    Vector<uint64_t> widths = { 62, 33, 1 };
 
     auto test = [&] (uint64_t lsb, uint64_t mask) -> uint64_t {
         Procedure proc;
@@ -2758,7 +2769,10 @@
             proc, Return, Origin(),
             root->appendNew<Value>(proc, BitAnd, Origin(), left, maskValue));
 
-        return compileAndRun<uint64_t>(proc, src);
+        auto code = compileProc(proc);
+        if (isARM64())
+            checkUsesInstruction(*code, "ubfx");
+        return invoke<uint64_t>(*code, src);
     };
 
     auto generateMask = [&] (uint64_t width) -> uint64_t {
@@ -2768,55 +2782,389 @@
     for (size_t i = 0; i < lsbs.size(); ++i) {
         uint64_t lsb = lsbs.at(i);
         uint64_t mask = generateMask(widths.at(i));
-        uint64_t lhs = test(lsb, mask);
-        uint64_t rhs = ((src >> lsb) & mask);
-        CHECK(lhs == rhs);
+        CHECK(test(lsb, mask) == ((src >> lsb) & mask));
     }
 }
 
-void testUbfx64PatternMatch()
+void testUbfx64AndShift()
 {
-    // (src >> lsb) & ((1 << width) - 1)
+    // Test Pattern: mask & (src >> lsb)
+    // Where: mask = (1 << width) - 1
+    if (JSC::Options::defaultB3OptLevel() < 2)
+        return;
     uint64_t src = 0xffffffffffffffff;
-    Vector<uint32_t> imms = { 0, 1, 30, 31, 32, 62, 63, 64 };
+    Vector<uint64_t> lsbs = { 1, 30, 62 };
+    Vector<uint64_t> widths = { 62, 33, 1 };
 
-    auto test = [&] (uint32_t lsb, uint32_t width) -> uint64_t {
+    auto test = [&] (uint64_t lsb, uint64_t mask) -> uint64_t {
         Procedure proc;
         BasicBlock* root = proc.addBlock();
 
         Value* srcValue = root->appendNew<ArgumentRegValue>(proc, Origin(), GPRInfo::argumentGPR0);
         Value* lsbValue = root->appendNew<Const32Value>(proc, Origin(), lsb);
-        Value* widthValue = root->appendNew<Const32Value>(proc, Origin(), width);
-        Value* constValueA = root->appendNew<Const64Value>(proc, Origin(), 1);
-        Value* constValueB = root->appendNew<Const64Value>(proc, Origin(), 1);
+        Value* maskValue = root->appendNew<Const64Value>(proc, Origin(), mask);
 
-        Value* left = root->appendNew<Value>(proc, ZShr, Origin(), srcValue, lsbValue);
-        Value* right = root->appendNew<Value>(
-            proc, Sub, Origin(), 
-            root->appendNew<Value>(proc, Shl, Origin(), constValueA, widthValue), constValueB);
+        Value* right = root->appendNew<Value>(proc, ZShr, Origin(), srcValue, lsbValue);
         root->appendNewControlValue(
             proc, Return, Origin(),
-            root->appendNew<Value>(proc, BitAnd, Origin(), left, right));
+            root->appendNew<Value>(proc, BitAnd, Origin(), maskValue, right));
 
-        return compileAndRun<uint64_t>(proc, src);
+        auto code = compileProc(proc);
+        if (isARM64())
+            checkUsesInstruction(*code, "ubfx");
+        return invoke<uint64_t>(*code, src);
     };
 
-    for (auto lsb : imms) {
-        for (auto width : imms) {
-            uint64_t lhs = test(lsb, width);
-            uint64_t rhs = ((src >> lsb) & ((1ULL << width) - 1ULL));
-            CHECK(lhs == rhs);
-        }
+    auto generateMask = [&] (uint64_t width) -> uint64_t {
+        return (1ULL << width) - 1ULL;
+    };
+
+    for (size_t i = 0; i < lsbs.size(); ++i) {
+        uint64_t lsb = lsbs.at(i);
+        uint64_t mask = generateMask(widths.at(i));
+        CHECK(test(lsb, mask) == (mask & (src >> lsb)));
     }
 }
 
+void testUbfiz32AndShiftValueMask()
+{
+    // Test Pattern: d = (n & mask) << lsb 
+    // Where: mask = (1 << width) - 1
+    if (JSC::Options::defaultB3OptLevel() < 2)
+        return;
+    uint32_t n = 0xffffffff;
+    Vector<uint32_t> lsbs = { 1, 14, 30 };
+    Vector<uint32_t> widths = { 30, 17, 1 };
+
+    auto test = [&] (uint32_t lsb, uint32_t mask) -> uint32_t {
+        Procedure proc;
+        BasicBlock* root = proc.addBlock();
+
+        Value* nValue = root->appendNew<Value>(
+            proc, Trunc, Origin(), 
+            root->appendNew<ArgumentRegValue>(proc, Origin(), GPRInfo::argumentGPR0));
+        Value* lsbValue = root->appendNew<Const32Value>(proc, Origin(), lsb);
+        Value* maskValue = root->appendNew<Const32Value>(proc, Origin(), mask);
+
+        Value* left = root->appendNew<Value>(proc, BitAnd, Origin(), nValue, maskValue);
+        root->appendNewControlValue(
+            proc, Return, Origin(),
+            root->appendNew<Value>(proc, Shl, Origin(), left, lsbValue));
+
+        auto code = compileProc(proc);
+        if (isARM64())
+            checkUsesInstruction(*code, "ubfiz");
+        return invoke<uint32_t>(*code, n);
+    };
+
+    auto generateMask = [&] (uint32_t width) -> uint32_t {
+        return (1U << width) - 1U;
+    };
+
+    for (size_t i = 0; i < lsbs.size(); ++i) {
+        uint32_t lsb = lsbs.at(i);
+        uint32_t mask = generateMask(widths.at(i));
+        CHECK(test(lsb, mask) == ((n & mask) << lsb));
+    }
+}
+
+void testUbfiz32AndShiftMaskValue()
+{
+    // Test Pattern: d = (mask & n) << lsb 
+    // Where: mask = (1 << width) - 1
+    if (JSC::Options::defaultB3OptLevel() < 2)
+        return;
+    uint32_t n = 0xffffffff;
+    Vector<uint32_t> lsbs = { 1, 14, 30 };
+    Vector<uint32_t> widths = { 30, 17, 1 };
+
+    auto test = [&] (uint32_t lsb, uint32_t mask) -> uint32_t {
+        Procedure proc;
+        BasicBlock* root = proc.addBlock();
+
+        Value* nValue = root->appendNew<Value>(
+            proc, Trunc, Origin(), 
+            root->appendNew<ArgumentRegValue>(proc, Origin(), GPRInfo::argumentGPR0));
+        Value* lsbValue = root->appendNew<Const32Value>(proc, Origin(), lsb);
+        Value* maskValue = root->appendNew<Const32Value>(proc, Origin(), mask);
+
+        Value* left = root->appendNew<Value>(proc, BitAnd, Origin(), maskValue, nValue);
+        root->appendNewControlValue(
+            proc, Return, Origin(),
+            root->appendNew<Value>(proc, Shl, Origin(), left, lsbValue));
+
+        auto code = compileProc(proc);
+        if (isARM64())
+            checkUsesInstruction(*code, "ubfiz");
+        return invoke<uint32_t>(*code, n);
+    };
+
+    auto generateMask = [&] (uint32_t width) -> uint32_t {
+        return (1U << width) - 1U;
+    };
+
+    for (size_t i = 0; i < lsbs.size(); ++i) {
+        uint32_t lsb = lsbs.at(i);
+        uint32_t mask = generateMask(widths.at(i));
+        CHECK(test(lsb, mask) == ((mask & n) << lsb));
+    }
+}
+
+void testUbfiz32ShiftAnd()
+{
+    // Test Pattern: d = (n << lsb) & maskShift
+    // Where: maskShift = mask << lsb
+    //        mask = (1 << width) - 1
+    if (JSC::Options::defaultB3OptLevel() < 2)
+        return;
+    uint32_t n = 0xffffffff;
+    Vector<uint32_t> lsbs = { 1, 14, 30 };
+    Vector<uint32_t> widths = { 30, 17, 1 };
+
+    auto test = [&] (uint32_t lsb, uint32_t maskShift) -> uint32_t {
+        Procedure proc;
+        BasicBlock* root = proc.addBlock();
+
+        Value* nValue = root->appendNew<Value>(
+            proc, Trunc, Origin(), 
+            root->appendNew<ArgumentRegValue>(proc, Origin(), GPRInfo::argumentGPR0));
+        Value* lsbValue = root->appendNew<Const32Value>(proc, Origin(), lsb);
+        Value* maskShiftValue = root->appendNew<Const32Value>(proc, Origin(), maskShift);
+
+        Value* left = root->appendNew<Value>(proc, Shl, Origin(), nValue, lsbValue);
+        root->appendNewControlValue(
+            proc, Return, Origin(),
+            root->appendNew<Value>(proc, BitAnd, Origin(), left, maskShiftValue));
+
+        auto code = compileProc(proc);
+        if (isARM64())
+            checkUsesInstruction(*code, "ubfiz");
+        return invoke<uint32_t>(*code, n);
+    };
+
+    auto generateMaskShift = [&] (uint32_t width, uint32_t lsb) -> uint32_t {
+        return ((1U << width) - 1U) << lsb;
+    };
+
+    for (size_t i = 0; i < lsbs.size(); ++i) {
+        uint32_t lsb = lsbs.at(i);
+        uint32_t maskShift = generateMaskShift(widths.at(i), lsb);
+        CHECK(test(lsb, maskShift) == ((n << lsb) & maskShift));
+    }
+}
+
+void testUbfiz32AndShift()
+{
+    // Test Pattern: d = maskShift & (n << lsb)
+    // Where: maskShift = mask << lsb
+    //        mask = (1 << width) - 1
+    if (JSC::Options::defaultB3OptLevel() < 2)
+        return;
+    uint32_t n = 0xffffffff;
+    Vector<uint32_t> lsbs = { 1, 14, 30 };
+    Vector<uint32_t> widths = { 30, 17, 1 };
+
+    auto test = [&] (uint32_t lsb, uint32_t maskShift) -> uint32_t {
+        Procedure proc;
+        BasicBlock* root = proc.addBlock();
+
+        Value* nValue = root->appendNew<Value>(
+            proc, Trunc, Origin(), 
+            root->appendNew<ArgumentRegValue>(proc, Origin(), GPRInfo::argumentGPR0));
+        Value* lsbValue = root->appendNew<Const32Value>(proc, Origin(), lsb);
+        Value* maskShiftValue = root->appendNew<Const32Value>(proc, Origin(), maskShift);
+
+        Value* right = root->appendNew<Value>(proc, Shl, Origin(), nValue, lsbValue);
+        root->appendNewControlValue(
+            proc, Return, Origin(),
+            root->appendNew<Value>(proc, BitAnd, Origin(), maskShiftValue, right));
+
+        auto code = compileProc(proc);
+        if (isARM64())
+            checkUsesInstruction(*code, "ubfiz");
+        return invoke<uint32_t>(*code, n);
+    };
+
+    auto generateMaskShift = [&] (uint32_t width, uint32_t lsb) -> uint32_t {
+        return ((1U << width) - 1U) << lsb;
+    };
+
+    for (size_t i = 0; i < lsbs.size(); ++i) {
+        uint32_t lsb = lsbs.at(i);
+        uint32_t maskShift = generateMaskShift(widths.at(i), lsb);
+        CHECK(test(lsb, maskShift) == (maskShift & (n << lsb)));
+    }
+}
+
+void testUbfiz64AndShiftValueMask()
+{
+    // Test Pattern: d = (n & mask) << lsb 
+    // Where: mask = (1 << width) - 1
+    if (JSC::Options::defaultB3OptLevel() < 2)
+        return;
+    uint64_t n = 0xffffffffffffffff;
+    Vector<uint64_t> lsbs = { 1, 30, 62 };
+    Vector<uint64_t> widths = { 62, 33, 1 };
+
+    auto test = [&] (uint64_t lsb, uint64_t mask) -> uint64_t {
+        Procedure proc;
+        BasicBlock* root = proc.addBlock();
+
+        Value* nValue = root->appendNew<ArgumentRegValue>(proc, Origin(), GPRInfo::argumentGPR0);
+        Value* lsbValue = root->appendNew<Const32Value>(proc, Origin(), lsb);
+        Value* maskValue = root->appendNew<Const64Value>(proc, Origin(), mask);
+
+        Value* left = root->appendNew<Value>(proc, BitAnd, Origin(), nValue, maskValue);
+        root->appendNewControlValue(
+            proc, Return, Origin(),
+            root->appendNew<Value>(proc, Shl, Origin(), left, lsbValue));
+
+        auto code = compileProc(proc);
+        if (isARM64())
+            checkUsesInstruction(*code, "ubfiz");
+        return invoke<uint64_t>(*code, n);
+    };
+
+    auto generateMask = [&] (uint64_t width) -> uint64_t {
+        return (1ULL << width) - 1ULL;
+    };
+
+    for (size_t i = 0; i < lsbs.size(); ++i) {
+        uint64_t lsb = lsbs.at(i);
+        uint64_t mask = generateMask(widths.at(i));
+        CHECK(test(lsb, mask) == ((n & mask) << lsb));
+    }
+}
+
+void testUbfiz64AndShiftMaskValue()
+{
+    // Test Pattern: d = (mask & n) << lsb 
+    // Where: mask = (1 << width) - 1
+    if (JSC::Options::defaultB3OptLevel() < 2)
+        return;
+    uint64_t n = 0xffffffffffffffff;
+    Vector<uint64_t> lsbs = { 1, 30, 62 };
+    Vector<uint64_t> widths = { 62, 33, 1 };
+
+    auto test = [&] (uint64_t lsb, uint64_t mask) -> uint64_t {
+        Procedure proc;
+        BasicBlock* root = proc.addBlock();
+
+        Value* nValue = root->appendNew<ArgumentRegValue>(proc, Origin(), GPRInfo::argumentGPR0);
+        Value* lsbValue = root->appendNew<Const32Value>(proc, Origin(), lsb);
+        Value* maskValue = root->appendNew<Const64Value>(proc, Origin(), mask);
+
+        Value* left = root->appendNew<Value>(proc, BitAnd, Origin(), maskValue, nValue);
+        root->appendNewControlValue(
+            proc, Return, Origin(),
+            root->appendNew<Value>(proc, Shl, Origin(), left, lsbValue));
+
+        auto code = compileProc(proc);
+        if (isARM64())
+            checkUsesInstruction(*code, "ubfiz");
+        return invoke<uint64_t>(*code, n);
+    };
+
+    auto generateMask = [&] (uint64_t width) -> uint64_t {
+        return (1ULL << width) - 1ULL;
+    };
+
+    for (size_t i = 0; i < lsbs.size(); ++i) {
+        uint64_t lsb = lsbs.at(i);
+        uint64_t mask = generateMask(widths.at(i));
+        CHECK(test(lsb, mask) == ((mask & n) << lsb));
+    }
+}
+
+void testUbfiz64ShiftAnd()
+{
+    // Test Pattern: d = (n << lsb) & maskShift
+    // Where: maskShift = mask << lsb
+    //        mask = (1 << width) - 1
+    if (JSC::Options::defaultB3OptLevel() < 2)
+        return;
+    uint64_t n = 0xffffffffffffffff;
+    Vector<uint64_t> lsbs = { 1, 30, 62 };
+    Vector<uint64_t> widths = { 62, 33, 1 };
+
+    auto test = [&] (uint64_t lsb, uint64_t maskShift) -> uint64_t {
+        Procedure proc;
+        BasicBlock* root = proc.addBlock();
+
+        Value* nValue = root->appendNew<ArgumentRegValue>(proc, Origin(), GPRInfo::argumentGPR0);
+        Value* lsbValue = root->appendNew<Const32Value>(proc, Origin(), lsb);
+        Value* maskShiftValue = root->appendNew<Const64Value>(proc, Origin(), maskShift);
+
+        Value* left = root->appendNew<Value>(proc, Shl, Origin(), nValue, lsbValue);
+        root->appendNewControlValue(
+            proc, Return, Origin(),
+            root->appendNew<Value>(proc, BitAnd, Origin(), left, maskShiftValue));
+
+        auto code = compileProc(proc);
+        if (isARM64())
+            checkUsesInstruction(*code, "ubfiz");
+        return invoke<uint64_t>(*code, n);
+    };
+
+    auto generateMaskShift = [&] (uint64_t width, uint64_t lsb) -> uint64_t {
+        return ((1ULL << width) - 1ULL) << lsb;
+    };
+
+    for (size_t i = 0; i < lsbs.size(); ++i) {
+        uint64_t lsb = lsbs.at(i);
+        uint64_t maskShift = generateMaskShift(widths.at(i), lsb);
+        CHECK(test(lsb, maskShift) == ((n << lsb) & maskShift));
+    }
+}
+
+void testUbfiz64AndShift()
+{
+    // Test Pattern: d = maskShift & (n << lsb)
+    // Where: maskShift = mask << lsb
+    //        mask = (1 << width) - 1
+    if (JSC::Options::defaultB3OptLevel() < 2)
+        return;
+    uint64_t n = 0xffffffffffffffff;
+    Vector<uint64_t> lsbs = { 1, 30, 62 };
+    Vector<uint64_t> widths = { 62, 33, 1 };
+
+    auto test = [&] (uint64_t lsb, uint64_t maskShift) -> uint64_t {
+        Procedure proc;
+        BasicBlock* root = proc.addBlock();
+
+        Value* nValue = root->appendNew<ArgumentRegValue>(proc, Origin(), GPRInfo::argumentGPR0);
+        Value* lsbValue = root->appendNew<Const32Value>(proc, Origin(), lsb);
+        Value* maskShiftValue = root->appendNew<Const64Value>(proc, Origin(), maskShift);
+
+        Value* right = root->appendNew<Value>(proc, Shl, Origin(), nValue, lsbValue);
+        root->appendNewControlValue(
+            proc, Return, Origin(),
+            root->appendNew<Value>(proc, BitAnd, Origin(), maskShiftValue, right));
+
+        auto code = compileProc(proc);
+        if (isARM64())
+            checkUsesInstruction(*code, "ubfiz");
+        return invoke<uint64_t>(*code, n);
+    };
+
+    auto generateMaskShift = [&] (uint64_t width, uint64_t lsb) -> uint64_t {
+        return ((1ULL << width) - 1ULL) << lsb;
+    };
+
+    for (size_t i = 0; i < lsbs.size(); ++i) {
+        uint64_t lsb = lsbs.at(i);
+        uint64_t maskShift = generateMaskShift(widths.at(i), lsb);
+        CHECK(test(lsb, maskShift) == (maskShift & (n << lsb)));
+    }
+}
+
 void testBitAndZeroShiftRightArgImmMask32()
 {
     // Turn this: (tmp >> imm) & mask 
     // Into this: tmp >> imm
     uint32_t tmp = 0xffffffff;
-    Vector<uint32_t> imms = { 4, 28 };
-    Vector<uint32_t> masks = { 0x0fffffff, 0xf };
+    Vector<uint32_t> imms = { 4, 28, 31 };
+    Vector<uint32_t> masks = { 0x0fffffff, 0xf, 0xffff };
 
     auto test = [&] (uint32_t imm, uint32_t mask) {
         Procedure proc;
@@ -2833,11 +3181,8 @@
             root->appendNew<Value>(proc, BitAnd, Origin(), leftValue, rightValue));
 
         auto code = compileProc(proc);
-        if (isARM64()) {
+        if (isARM64())
             checkUsesInstruction(*code, "lsr");
-            checkDoesNotUseInstruction(*code, "and");
-            checkDoesNotUseInstruction(*code, "ubfx");
-        }
         uint32_t lhs = invoke<uint32_t>(*code, tmp);
         uint32_t rhs = tmp >> imm;
         CHECK(lhs == rhs);
@@ -2852,8 +3197,8 @@
     // Turn this: (tmp >> imm) & mask 
     // Into this: tmp >> imm
     uint64_t tmp = 0xffffffffffffffff;
-    Vector<uint64_t> imms = { 4, 60 };
-    Vector<uint64_t> masks = { 0x0fffffffffffffff, 0xf };
+    Vector<uint64_t> imms = { 4, 60, 63 };
+    Vector<uint64_t> masks = { 0x0fffffffffffffff, 0xf, 0xffff };
 
     auto test = [&] (uint64_t imm, uint64_t mask) {
         Procedure proc;
@@ -2868,11 +3213,8 @@
             root->appendNew<Value>(proc, BitAnd, Origin(), leftValue, rightValue));
 
         auto code = compileProc(proc);
-        if (isARM64()) {
+        if (isARM64())
             checkUsesInstruction(*code, "lsr");
-            checkDoesNotUseInstruction(*code, "and");
-            checkDoesNotUseInstruction(*code, "ubfx");
-        }
         uint64_t lhs = invoke<uint64_t>(*code, tmp);
         uint64_t rhs = tmp >> imm;
         CHECK(lhs == rhs);
@@ -3696,10 +4038,18 @@
 
 void addBitTests(const char* filter, Deque<RefPtr<SharedTask<void()>>>& tasks)
 {
-    RUN(testUbfx32());
-    RUN(testUbfx32PatternMatch());
-    RUN(testUbfx64());
-    RUN(testUbfx64PatternMatch());
+    RUN(testUbfx32ShiftAnd());
+    RUN(testUbfx32AndShift());
+    RUN(testUbfx64ShiftAnd());
+    RUN(testUbfx64AndShift());
+    RUN(testUbfiz32AndShiftValueMask());
+    RUN(testUbfiz32AndShiftMaskValue());
+    RUN(testUbfiz32ShiftAnd());
+    RUN(testUbfiz32AndShift());
+    RUN(testUbfiz64AndShiftValueMask());
+    RUN(testUbfiz64AndShiftMaskValue());
+    RUN(testUbfiz64ShiftAnd());
+    RUN(testUbfiz64AndShift());
     RUN(testBitAndZeroShiftRightArgImmMask32());
     RUN(testBitAndZeroShiftRightArgImmMask64());
     RUN(testBitAndArgs(43, 43));

Modified: trunk/Source/WTF/wtf/MathExtras.h (279192 => 279193)


--- trunk/Source/WTF/wtf/MathExtras.h	2021-06-23 22:04:40 UTC (rev 279192)
+++ trunk/Source/WTF/wtf/MathExtras.h	2021-06-23 22:09:31 UTC (rev 279193)
@@ -728,6 +728,28 @@
     return bitSize - 1 - clzConstexpr(t);
 }
 
+inline size_t countTrailingZeros(uint32_t v)
+{
+    static const unsigned Mod37BitPosition[] = {
+        32, 0, 1, 26, 2, 23, 27, 0, 3, 16, 24, 30, 28, 11, 0, 13,
+        4, 7, 17, 0, 25, 22, 31, 15, 29, 10, 12, 6, 0, 21, 14, 9,
+        5, 20, 8, 19, 18
+    };
+    return Mod37BitPosition[((1 + ~v) & v) % 37];
+}
+
+inline size_t countTrailingZeros(uint64_t v)
+{
+    static const unsigned Mod67Position[] = {
+        64, 0, 1, 39, 2, 15, 40, 23, 3, 12, 16, 59, 41, 19, 24, 54,
+        4, 64, 13, 10, 17, 62, 60, 28, 42, 30, 20, 51, 25, 44, 55,
+        47, 5, 32, 65, 38, 14, 22, 11, 58, 18, 53, 63, 9, 61, 27,
+        29, 50, 43, 46, 31, 37, 21, 57, 52, 8, 26, 49, 45, 36, 56,
+        7, 48, 35, 6, 34, 33, 0
+    };
+    return Mod67Position[((1 + ~v) & v) % 67];
+}
+
 } // namespace WTF
 
 using WTF::shuffleVector;