Title: [286424] trunk/Source/JavaScriptCore
Revision: 286424
Author: [email protected]
Date: 2021-12-02 06:26:22 -0800 (Thu, 02 Dec 2021)

Log Message

[JSC] Generated code size reductions for baseline JIT (all architectures)
https://bugs.webkit.org/show_bug.cgi?id=233474

Patch by Geza Lore <[email protected]> on 2021-12-02
Reviewed by Yusuke Suzuki.

This patch introduces a few improvements that reduce the generated
code size.

Target independent improvements to the Baseline JIT:

1. Some bytecodes that are very frequent (e.g. get_by_id, call) share
the same instructions at the tail end of their fast and slow paths.
Instead of duplicating these instructions in the slow path and then
branching back to the next sequential bytecode, make the slow path
branch to and reuse the common tail of the fast path, which then
naturally falls through to the next sequential bytecode.
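
A schematic sketch of the layout change (simplified pseudo-code; in the
patch the shared tail is recorded with the new setFastPathResumePoint()
and located again with fastPathResumePoint(), both added in
jit/JITInlines.h below):

    // Before: the tail is duplicated on both paths.
    fastPath:
        // ... inline cache hit ...
        emitPutVirtualRegister(dst, resultJSR);   // duplicated tail
        // falls through to the next bytecode
    slowPath:
        // ... call the slow operation ...
        emitPutVirtualRegister(dst, resultJSR);   // duplicated tail
        jump(nextBytecode);

    // After: the slow path branches into the shared fast path tail.
    fastPath:
        // ... inline cache hit ...
    resume:                                       // setFastPathResumePoint()
        emitPutVirtualRegister(dst, resultJSR);   // shared tail
        // falls through to the next bytecode
    slowPath:
        // ... call the slow operation ...
        jump(resume);                             // via fastPathResumePoint()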

2. Minor tweaks in a few places to remove redundant reloading of
immediates and remove redundant moves.

3. Remove a small number of redundant unconditional branches from some
DataIC fast paths.
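
For example, generateGetByIdInlineAccess (see the
JITInlineCacheGenerator.cpp hunk below) now inverts the structure check
so that the common cache hit falls through instead of jumping over the
miss case. Abbreviated here, with 'expectedStructure' and 'codePtr'
standing in for the StructureStubInfo address operands in the real code:

    // Before: every inline cache hit pays for an unconditional jump.
    auto skipInlineAccess = jit.branch32(CCallHelpers::NotEqual, scratchGPR, expectedStructure);
    jit.loadProperty(baseJSR.payloadGPR(), scratchGPR, resultJSR);
    auto finished = jit.jump();
    skipInlineAccess.link(&jit);
    jit.farJump(codePtr, JITStubRoutinePtrTag);
    finished.link(&jit);

    // After: the hit case falls through with no unconditional branch.
    auto doInlineAccess = jit.branch32(CCallHelpers::Equal, scratchGPR, expectedStructure);
    jit.farJump(codePtr, JITStubRoutinePtrTag);
    doInlineAccess.link(&jit);
    jit.loadProperty(baseJSR.payloadGPR(), scratchGPR, resultJSR);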

ARMv7/Thumb-2 specific improvements:

4. Add assembler support for LDRD and STRD (load/store a pair of
32-bit GPRs) and use them throughout via loadValue/storeValue. This
yields denser code as it often eliminates repeated temporary register
setups (especially for a BaseIndex access), and also due to point 5
below. This is also potentially a performance improvement on
micro-architectures with a 64-bit LSU data path.
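
For example, storing a JSValue (32-bit payload plus tag) to a BaseIndex
address via storeValue can now come out roughly as follows (illustrative
register numbers; 'valueRegs', 'baseGPR' and 'indexGPR' are placeholder
names):

    // Before: two STRs, each preceded by its own address setup.
    //   add  r6, r1, r2, lsl #3
    //   str  r0, [r6]          ; payload
    //   add  r6, r1, r2, lsl #3
    //   str  r3, [r6, #4]      ; tag
    // After: one address setup and a single STRD.
    //   add  r12, r1, r2, lsl #3
    //   strd r0, r3, [r12]     ; payload and tag in one instruction
    jit.storeValue(valueRegs, BaseIndex(baseGPR, indexGPR, TimesEight));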

5. Instructions using only r0-r7 as operands can often use a short,
16-bit encoding in Thumb-2, so prefer to use low-order registers
as temporaries wherever possible.
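
For example (illustrative Thumb-2 encodings; the C++ snippet is the new
fallback in compare32AndSetFlags below):

    // The same load is one halfword with low registers, two otherwise:
    //   ldr r0, [r6, #4]       ; 16-bit: both registers in r0-r7
    //   ldr r0, [r12, #4]      ; 32-bit: r12 forces the wide encoding
    // Hence scratchRegister() now returns addressTempRegister (r6) rather
    // than dataTempRegister (r12), and the new bestTempRegister() helper
    // prefers the low addressTempRegister unless it must be avoided.
    RegisterID scratch = bestTempRegister(left);
    move(TrustedImm32(imm), scratch);
    m_assembler.cmp(left, scratch);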

The net effect of this patch is that the emitted baseline code during
a run of JetStream2 is ~6.6% smaller on x86_64, ~5.1% smaller on
ARM64, and ~24% smaller on ARMv7/Thumb-2. On ARMv7/Thumb-2, DFG code
is also ~5.3% smaller, while on other architectures the DFG code is
unaffected.

On ARMv7/Thumb-2, this patch also yields a ~2% improvement in
JetStream2 scores on my test machine.

* assembler/ARMv7Assembler.h:
(JSC::ARMv7Assembler::ldrd):
(JSC::ARMv7Assembler::strd):
(JSC::ARMv7Assembler::ARMInstructionFormatter::twoWordOp12Reg4Reg4Reg4Imm8):
* assembler/MacroAssembler.h:
(JSC::MacroAssembler::addPtr):
* assembler/MacroAssemblerARMv7.h:
(JSC::MacroAssemblerARMv7::bestTempRegister):
(JSC::MacroAssemblerARMv7::scratchRegister):
(JSC::MacroAssemblerARMv7::add32):
(JSC::MacroAssemblerARMv7::sub32):
(JSC::MacroAssemblerARMv7::loadPair32):
(JSC::MacroAssemblerARMv7::store32):
(JSC::MacroAssemblerARMv7::storePair32):
(JSC::MacroAssemblerARMv7::compare32AndSetFlags):
(JSC::MacroAssemblerARMv7::test32):
(JSC::MacroAssemblerARMv7::branch32):
(JSC::MacroAssemblerARMv7::farJump):
(JSC::MacroAssemblerARMv7::call):
(JSC::MacroAssemblerARMv7::compare32):
* assembler/MacroAssemblerMIPS.h:
(JSC::MacroAssemblerMIPS::loadPair32):
(JSC::MacroAssemblerMIPS::storePair32):
* jit/AssemblyHelpers.h:
(JSC::AssemblyHelpers::storeValue):
(JSC::AssemblyHelpers::loadValue):
* jit/JIT.cpp:
(JSC::JIT::privateCompileSlowCases):
* jit/JIT.h:
* jit/JITCall.cpp:
(JSC::JIT::compileCallEval):
(JSC::JIT::compileCallEvalSlowCase):
(JSC::JIT::compileOpCall):
(JSC::JIT::compileOpCallSlowCase):
(JSC::JIT::emitSlow_op_iterator_open):
(JSC::JIT::emitSlow_op_iterator_next):
* jit/JITInlineCacheGenerator.cpp:
(JSC::generateGetByIdInlineAccess):
(JSC::JITPutByIdGenerator::generateBaselineDataICFastPath):
* jit/JITInlineCacheGenerator.h:
* jit/JITInlines.h:
(JSC::JIT::setFastPathResumePoint):
(JSC::JIT::fastPathResumePoint const):
* jit/JITOpcodes.cpp:
(JSC::JIT::emit_op_enter):
* jit/JITPropertyAccess.cpp:
(JSC::JIT::emit_op_get_by_val):
(JSC::JIT::generateGetByValSlowCase):
(JSC::JIT::emit_op_get_private_name):
(JSC::JIT::emitSlow_op_get_private_name):
(JSC::JIT::emit_op_try_get_by_id):
(JSC::JIT::emitSlow_op_try_get_by_id):
(JSC::JIT::emit_op_get_by_id_direct):
(JSC::JIT::emitSlow_op_get_by_id_direct):
(JSC::JIT::emit_op_get_by_id):
(JSC::JIT::emitSlow_op_get_by_id):
(JSC::JIT::emit_op_get_by_id_with_this):
(JSC::JIT::emitSlow_op_get_by_id_with_this):
(JSC::JIT::emit_op_in_by_id):
(JSC::JIT::emitSlow_op_in_by_id):
(JSC::JIT::emit_op_in_by_val):
(JSC::JIT::emitSlow_op_in_by_val):
(JSC::JIT::emitHasPrivate):
(JSC::JIT::emitHasPrivateSlow):
(JSC::JIT::emitSlow_op_has_private_name):
(JSC::JIT::emitSlow_op_has_private_brand):
(JSC::JIT::emit_op_enumerator_get_by_val):
(JSC::JIT::emitWriteBarrier):

Modified Paths

Diff

Modified: trunk/Source/JavaScriptCore/ChangeLog (286423 => 286424)


--- trunk/Source/JavaScriptCore/ChangeLog	2021-12-02 14:15:29 UTC (rev 286423)
+++ trunk/Source/JavaScriptCore/ChangeLog	2021-12-02 14:26:22 UTC (rev 286424)
@@ -1,3 +1,119 @@
+2021-12-02  Geza Lore  <[email protected]>
+
+        [JSC] Generated code size reductions for baseline JIT (all architectures)
+        https://bugs.webkit.org/show_bug.cgi?id=233474
+
+        Reviewed by Yusuke Suzuki.
+
+        This patch introduces a few improvements that reduce the generated
+        code size.
+
+        Target independent improvements to the Baseline JIT:
+
+        1. Some bytecodes that are very frequent (e.g. get_by_id, call) share
+        the same instructions at the tail end of their fast and slow paths.
+        Instead of duplicating these instructions in the slow path and then
+        branching back to the next sequential bytecode, make the slow path
+        branch to and reuse the common tail of the fast path, which then
+        naturally falls through to the next sequential bytecode.
+
+        2. Minor tweaks in a few places to remove redundant reloading of
+        immediates and remove redundant moves.
+
+        3. Remove a small number of redundant unconditional branches from some
+        DataIC fast paths.
+
+        ARMv7/Thumb-2 specific improvements:
+
+        4. Add assembler support for LDRD and STRD (load/store a pair of
+        32-bit GPRs) and use them throughout via loadValue/storeValue. This
+        yields denser code as it often eliminates repeated temporary register
+        setups (especially for a BaseIndex access), and also due to point 5
+        below. This is also potentially a performance improvement on
+        micro-architectures with a 64-bit LSU data path.
+
+        5. Instructions using only r0-r7 as operands can often use a short,
+        16-bit encoding in Thumb-2, so prefer to use low-order registers
+        as temporaries wherever possible.
+
+        The net effect of this patch is that the emitted baseline code during
+        a run of JetStream2 is ~6.6% smaller on x86_64, ~5.1% smaller on
+        ARM64, and ~24% smaller on ARMv7/Thumb-2. On ARMv7/Thumb-2, DFG code
+        is also ~5.3% smaller, while on other architectures the DFG code is
+        unaffected.
+
+        On ARMv7/Thumb-2, this patch also yields a ~2% improvement in
+        JetStream2 scores on my test machine.
+
+        * assembler/ARMv7Assembler.h:
+        (JSC::ARMv7Assembler::ldrd):
+        (JSC::ARMv7Assembler::strd):
+        (JSC::ARMv7Assembler::ARMInstructionFormatter::twoWordOp12Reg4Reg4Reg4Imm8):
+        * assembler/MacroAssembler.h:
+        (JSC::MacroAssembler::addPtr):
+        * assembler/MacroAssemblerARMv7.h:
+        (JSC::MacroAssemblerARMv7::bestTempRegister):
+        (JSC::MacroAssemblerARMv7::scratchRegister):
+        (JSC::MacroAssemblerARMv7::add32):
+        (JSC::MacroAssemblerARMv7::sub32):
+        (JSC::MacroAssemblerARMv7::loadPair32):
+        (JSC::MacroAssemblerARMv7::store32):
+        (JSC::MacroAssemblerARMv7::storePair32):
+        (JSC::MacroAssemblerARMv7::compare32AndSetFlags):
+        (JSC::MacroAssemblerARMv7::test32):
+        (JSC::MacroAssemblerARMv7::branch32):
+        (JSC::MacroAssemblerARMv7::farJump):
+        (JSC::MacroAssemblerARMv7::call):
+        (JSC::MacroAssemblerARMv7::compare32):
+        * assembler/MacroAssemblerMIPS.h:
+        (JSC::MacroAssemblerMIPS::loadPair32):
+        (JSC::MacroAssemblerMIPS::storePair32):
+        * jit/AssemblyHelpers.h:
+        (JSC::AssemblyHelpers::storeValue):
+        (JSC::AssemblyHelpers::loadValue):
+        * jit/JIT.cpp:
+        (JSC::JIT::privateCompileSlowCases):
+        * jit/JIT.h:
+        * jit/JITCall.cpp:
+        (JSC::JIT::compileCallEval):
+        (JSC::JIT::compileCallEvalSlowCase):
+        (JSC::JIT::compileOpCall):
+        (JSC::JIT::compileOpCallSlowCase):
+        (JSC::JIT::emitSlow_op_iterator_open):
+        (JSC::JIT::emitSlow_op_iterator_next):
+        * jit/JITInlineCacheGenerator.cpp:
+        (JSC::generateGetByIdInlineAccess):
+        (JSC::JITPutByIdGenerator::generateBaselineDataICFastPath):
+        * jit/JITInlineCacheGenerator.h:
+        * jit/JITInlines.h:
+        (JSC::JIT::setFastPathResumePoint):
+        (JSC::JIT::fastPathResumePoint const):
+        * jit/JITOpcodes.cpp:
+        (JSC::JIT::emit_op_enter):
+        * jit/JITPropertyAccess.cpp:
+        (JSC::JIT::emit_op_get_by_val):
+        (JSC::JIT::generateGetByValSlowCase):
+        (JSC::JIT::emit_op_get_private_name):
+        (JSC::JIT::emitSlow_op_get_private_name):
+        (JSC::JIT::emit_op_try_get_by_id):
+        (JSC::JIT::emitSlow_op_try_get_by_id):
+        (JSC::JIT::emit_op_get_by_id_direct):
+        (JSC::JIT::emitSlow_op_get_by_id_direct):
+        (JSC::JIT::emit_op_get_by_id):
+        (JSC::JIT::emitSlow_op_get_by_id):
+        (JSC::JIT::emit_op_get_by_id_with_this):
+        (JSC::JIT::emitSlow_op_get_by_id_with_this):
+        (JSC::JIT::emit_op_in_by_id):
+        (JSC::JIT::emitSlow_op_in_by_id):
+        (JSC::JIT::emit_op_in_by_val):
+        (JSC::JIT::emitSlow_op_in_by_val):
+        (JSC::JIT::emitHasPrivate):
+        (JSC::JIT::emitHasPrivateSlow):
+        (JSC::JIT::emitSlow_op_has_private_name):
+        (JSC::JIT::emitSlow_op_has_private_brand):
+        (JSC::JIT::emit_op_enumerator_get_by_val):
+        (JSC::JIT::emitWriteBarrier):
+
 2021-12-01  Adrian Perez de Castro  <[email protected]>
 
         Non-unified build fixes, early December 2021 edition

Modified: trunk/Source/JavaScriptCore/assembler/ARMv7Assembler.h (286423 => 286424)


--- trunk/Source/JavaScriptCore/assembler/ARMv7Assembler.h	2021-12-02 14:15:29 UTC (rev 286423)
+++ trunk/Source/JavaScriptCore/assembler/ARMv7Assembler.h	2021-12-02 14:26:22 UTC (rev 286424)
@@ -562,6 +562,8 @@
     typedef enum {
         OP_B_T1         = 0xD000,
         OP_B_T2         = 0xE000,
+        OP_STRD_imm_T1  = 0xE840,
+        OP_LDRD_imm_T1  = 0xE850,
         OP_POP_T2       = 0xE8BD,
         OP_PUSH_T2      = 0xE92D,
         OP_AND_reg_T2   = 0xEA00,
@@ -810,7 +812,7 @@
     // NOTE: In an IT block, add doesn't modify the flags register.
     ALWAYS_INLINE void add(RegisterID rd, RegisterID rn, RegisterID rm)
     {
-        if (rd == ARMRegisters::sp) {
+        if (rd == ARMRegisters::sp && rd != rn) {
             mov(rd, rn);
             rn = rd;
         }
@@ -1240,6 +1242,47 @@
             m_formatter.twoWordOp12Reg4FourFours(OP_LDRSH_reg_T2, rn, FourFours(rt, 0, shift, rm));
     }
 
+    // If index is set, this is a regular offset or a pre-indexed load;
+    // if index is not set, then it is a post-indexed load.
+    //
+    // If wback is set, rn is updated (pre- or post-indexed load);
+    // if wback is not set, this is a regular offset memory access.
+    //
+    // (-1020 <= offset <= 1020)
+    // offset % 4 == 0
+    // _reg = REG[rn]
+    // _tmp = _reg + offset
+    // _addr = index ? _tmp : _reg
+    // REG[rt] = MEM[_addr]
+    // REG[rt2] = MEM[_addr + 4]
+    // if (wback) REG[rn] = _tmp
+    ALWAYS_INLINE void ldrd(RegisterID rt, RegisterID rt2, RegisterID rn, int offset, bool index, bool wback)
+    {
+        ASSERT(!BadReg(rt));
+        ASSERT(!BadReg(rt2));
+        ASSERT(rn != ARMRegisters::pc);
+        ASSERT(rt != rt2);
+        ASSERT(index || wback);
+        ASSERT(!wback || (rt != rn));
+        ASSERT(!wback || (rt2 != rn));
+        ASSERT(!(offset & 0x3));
+
+        bool add = true;
+        if (offset < 0) {
+            add = false;
+            offset = -offset;
+        }
+        offset >>= 2;
+        ASSERT(!(offset & ~0xff));
+
+        uint16_t opcode = OP_LDRD_imm_T1;
+        opcode |= (wback << 5);
+        opcode |= (add << 7);
+        opcode |= (index << 8);
+
+        m_formatter.twoWordOp12Reg4Reg4Reg4Imm8(static_cast<OpcodeID1>(opcode), rn, rt, rt2, offset);
+    }
+
     void lsl(RegisterID rd, RegisterID rm, int32_t shiftAmount)
     {
         ASSERT(!BadReg(rd));
@@ -1329,6 +1372,7 @@
 
     ALWAYS_INLINE void mov(RegisterID rd, RegisterID rm)
     {
+        ASSERT(rd != rm); // Use a NOP instead
         m_formatter.oneWordOp8RegReg143(OP_MOV_reg_T1, rm, rd);
     }
 
@@ -1676,6 +1720,46 @@
             m_formatter.twoWordOp12Reg4FourFours(OP_STRH_reg_T2, rn, FourFours(rt, 0, shift, rm));
     }
 
+    // If index is set, this is a regular offset or a pre-indexed store;
+    // if index is not set, then it is a post-indexed store.
+    //
+    // If wback is set, rn is updated (pre- or post-indexed store);
+    // if wback is not set, this is a regular offset memory access.
+    //
+    // (-1020 <= offset <= 1020)
+    // offset % 4 == 0
+    // _reg = REG[rn]
+    // _tmp = _reg + offset
+    // _addr = index ? _tmp : _reg
+    // MEM[_addr] = REG[rt]
+    // MEM[_addr + 4] = REG[rt2]
+    // if (wback) REG[rn] = _tmp
+    ALWAYS_INLINE void strd(RegisterID rt, RegisterID rt2, RegisterID rn, int offset, bool index, bool wback)
+    {
+        ASSERT(!BadReg(rt));
+        ASSERT(!BadReg(rt2));
+        ASSERT(rn != ARMRegisters::pc);
+        ASSERT(index || wback);
+        ASSERT(!wback || (rt != rn));
+        ASSERT(!wback || (rt2 != rn));
+        ASSERT(!(offset & 0x3));
+
+        bool add = true;
+        if (offset < 0) {
+            add = false;
+            offset = -offset;
+        }
+        offset >>= 2;
+        ASSERT(!(offset & ~0xff));
+
+        uint16_t opcode = OP_STRD_imm_T1;
+        opcode |= (wback << 5);
+        opcode |= (add << 7);
+        opcode |= (index << 8);
+
+        m_formatter.twoWordOp12Reg4Reg4Reg4Imm8(static_cast<OpcodeID1>(opcode), rn, rt, rt2, offset);
+    }
+
     ALWAYS_INLINE void sub(RegisterID rd, RegisterID rn, ARMThumbImmediate imm)
     {
         // Rd can only be SP if Rn is also SP.
@@ -2887,6 +2971,12 @@
             m_buffer.putShort((reg2 << 12) | imm);
         }
 
+        ALWAYS_INLINE void twoWordOp12Reg4Reg4Reg4Imm8(OpcodeID1 op, RegisterID reg1, RegisterID reg2, RegisterID reg3, uint8_t imm)
+        {
+            m_buffer.putShort(op | reg1);
+            m_buffer.putShort((reg2 << 12) | (reg3 << 8) | imm);
+        }
+
         ALWAYS_INLINE void twoWordOp12Reg40Imm3Reg4Imm20Imm5(OpcodeID1 op, RegisterID reg1, RegisterID reg2, uint16_t imm1, uint16_t imm2, uint16_t imm3)
         {
             m_buffer.putShort(op | reg1);

Modified: trunk/Source/JavaScriptCore/assembler/MacroAssemblerARMv7.h (286423 => 286424)


--- trunk/Source/JavaScriptCore/assembler/MacroAssemblerARMv7.h	2021-12-02 14:15:29 UTC (rev 286423)
+++ trunk/Source/JavaScriptCore/assembler/MacroAssemblerARMv7.h	2021-12-02 14:26:22 UTC (rev 286424)
@@ -44,12 +44,22 @@
     static constexpr ARMRegisters::FPDoubleRegisterID fpTempRegister = ARMRegisters::d7;
     inline ARMRegisters::FPSingleRegisterID fpTempRegisterAsSingle() { return ARMRegisters::asSingle(fpTempRegister); }
 
+    // In the Thumb-2 instruction set, instructions operating only on registers r0-r7 can often
+    // be encoded using 16-bit encodings, while the use of registers r8 and above often requires
+    // 32-bit encodings, so prefer to use addressTempRegister (r6) whenever possible.
+    inline RegisterID bestTempRegister(RegisterID excluded)
+    {
+        if (excluded == addressTempRegister)
+            return dataTempRegister;
+        return addressTempRegister;
+    }
+
 public:
 #define DUMMY_REGISTER_VALUE(id, name, r, cs) 0,
     static constexpr unsigned numGPRs = std::initializer_list<int>({ FOR_EACH_GP_REGISTER(DUMMY_REGISTER_VALUE) }).size();
     static constexpr unsigned numFPRs = std::initializer_list<int>({ FOR_EACH_FP_REGISTER(DUMMY_REGISTER_VALUE) }).size();
 #undef DUMMY_REGISTER_VALUE
-    RegisterID scratchRegister() { return dataTempRegister; }
+    RegisterID scratchRegister() { return addressTempRegister; }
 
     MacroAssemblerARMv7()
         : m_makeJumpPatchable(false)
@@ -181,20 +191,27 @@
 
     void add32(TrustedImm32 imm, RegisterID src, RegisterID dest)
     {
-        // For adds with stack pointer destination avoid unpredictable instruction
+        // Avoid unpredictable instruction if the destination is the stack pointer
         if (dest == ARMRegisters::sp && src != dest) {
-            add32(imm, src, dataTempRegister);
-            move(dataTempRegister, dest);
+            add32(imm, src, addressTempRegister);
+            move(addressTempRegister, dest);
             return;
         }
 
         ARMThumbImmediate armImm = ARMThumbImmediate::makeUInt12OrEncodedImm(imm.m_value);
-        if (armImm.isValid())
+        if (armImm.isValid()) {
             m_assembler.add(dest, src, armImm);
-        else {
-            move(imm, dataTempRegister);
-            m_assembler.add(dest, src, dataTempRegister);
+            return;
         }
+
+        armImm = ARMThumbImmediate::makeUInt12OrEncodedImm(-imm.m_value);
+        if (armImm.isValid()) {
+            m_assembler.sub(dest, src, armImm);
+            return;
+        }
+
+        move(imm, dataTempRegister);
+        m_assembler.add(dest, src, dataTempRegister);
     }
 
     void add32(TrustedImm32 imm, Address address)
@@ -842,17 +859,42 @@
 
     void loadPair32(RegisterID src, TrustedImm32 offset, RegisterID dest1, RegisterID dest2)
     {
-        // FIXME: ldrd/ldm can be used if dest1 and dest2 are consecutive pair of registers.
+        loadPair32(Address(src, offset.m_value), dest1, dest2);
+    }
+
+    void loadPair32(Address address, RegisterID dest1, RegisterID dest2)
+    {
         ASSERT(dest1 != dest2); // If it is the same, ldp becomes illegal instruction.
-        if (src == dest1) {
-            load32(Address(src, offset.m_value + 4), dest2);
-            load32(Address(src, offset.m_value), dest1);
+        int32_t absOffset = address.offset;
+        if (absOffset < 0)
+            absOffset = -absOffset;
+        if (!(absOffset & ~0x3fc))
+            m_assembler.ldrd(dest1, dest2, address.base, address.offset, /* index: */ true, /* wback: */ false);
+        else if (address.base == dest1) {
+            load32(address.withOffset(4), dest2);
+            load32(address, dest1);
         } else {
-            load32(Address(src, offset.m_value), dest1);
-            load32(Address(src, offset.m_value + 4), dest2);
+            load32(address, dest1);
+            load32(address.withOffset(4), dest2);
         }
     }
 
+    void loadPair32(BaseIndex address, RegisterID dest1, RegisterID dest2)
+    {
+        // Accesses using only r0-r7 can often be encoded with a shorter (16-bit vs 32-bit) instruction,
+        // so use whichever destination register is in that range (if any) as the address temp register.
+        RegisterID scratch = dest1;
+        if (dest1 >= ARMRegisters::r8)
+            scratch = dest2;
+        if (address.scale == TimesOne)
+            m_assembler.add(scratch, address.base, address.index);
+        else {
+            ShiftTypeAndAmount shift { ARMShiftType::SRType_LSL, static_cast<unsigned>(address.scale) };
+            m_assembler.add(scratch, address.base, address.index, shift);
+        }
+        loadPair32(Address(scratch, address.offset), dest1, dest2);
+    }
+
     void store32(RegisterID src, Address address)
     {
         store32(src, setupArmAddress(address));
@@ -865,8 +907,12 @@
 
     void store32(TrustedImm32 imm, Address address)
     {
-        move(imm, dataTempRegister);
-        store32(dataTempRegister, setupArmAddress(address));
+        ArmAddress armAddress = setupArmAddress(address);
+        RegisterID scratch = addressTempRegister;
+        if (armAddress.type == ArmAddress::HasIndex)
+            scratch = dataTempRegister;
+        move(imm, scratch);
+        store32(scratch, armAddress);
     }
 
     void store32(TrustedImm32 imm, BaseIndex address)
@@ -951,11 +997,35 @@
 
     void storePair32(RegisterID src1, RegisterID src2, RegisterID dest, TrustedImm32 offset)
     {
-        // FIXME: strd/stm can be used if src1 and src2 are consecutive pair of registers.
-        store32(src1, Address(dest, offset.m_value));
-        store32(src2, Address(dest, offset.m_value + 4));
+        storePair32(src1, src2, Address(dest, offset.m_value));
     }
 
+    void storePair32(RegisterID src1, RegisterID src2, Address address)
+    {
+        int32_t absOffset = address.offset;
+        if (absOffset < 0)
+            absOffset = -absOffset;
+        if (!(absOffset & ~0x3fc))
+            m_assembler.strd(src1, src2, address.base, address.offset, /* index: */ true, /* wback: */ false);
+        else {
+            store32(src1, address);
+            store32(src2, address.withOffset(4));
+        }
+    }
+
+    void storePair32(RegisterID src1, RegisterID src2, BaseIndex address)
+    {
+        ASSERT(src1 != dataTempRegister && src2 != dataTempRegister);
+        // The 'addressTempRegister' might be used when the offset is wide, so use 'dataTempRegister'.
+        if (address.scale == TimesOne)
+            m_assembler.add(dataTempRegister, address.base, address.index);
+        else {
+            ShiftTypeAndAmount shift { ARMShiftType::SRType_LSL, static_cast<unsigned>(address.scale) };
+            m_assembler.add(dataTempRegister, address.base, address.index, shift);
+        }
+        storePair32(src1, src2, Address(dataTempRegister, address.offset));
+    }
+
     // Possibly clobbers src, but not on this architecture.
     void moveDoubleToInts(FPRegisterID src, RegisterID dest1, RegisterID dest2)
     {
@@ -1522,8 +1592,9 @@
         else if ((armImm = ARMThumbImmediate::makeEncodedImm(-imm)).isValid())
             m_assembler.cmn(left, armImm);
         else {
-            move(TrustedImm32(imm), dataTempRegister);
-            m_assembler.cmp(left, dataTempRegister);
+            RegisterID scratch = bestTempRegister(left);
+            move(TrustedImm32(imm), scratch);
+            m_assembler.cmp(left, scratch);
         }
     }
 
@@ -1589,12 +1660,13 @@
                 } else
                     m_assembler.tst(reg, armImm);
             } else {
-                move(mask, dataTempRegister);
                 if (reg == ARMRegisters::sp) {
-                    move(reg, addressTempRegister);
-                    m_assembler.tst(addressTempRegister, dataTempRegister);
-                } else
-                    m_assembler.tst(reg, dataTempRegister);
+                    move(reg, dataTempRegister);
+                    reg = dataTempRegister;
+                }
+                RegisterID scratch = bestTempRegister(reg);
+                move(mask, scratch);
+                m_assembler.tst(reg, scratch);
             }
         }
     }
@@ -1625,14 +1697,14 @@
 
     Jump branch32(RelationalCondition cond, RegisterID left, Address right)
     {
-        load32(right, dataTempRegister);
-        return branch32(cond, left, dataTempRegister);
+        load32(right, addressTempRegister);
+        return branch32(cond, left, addressTempRegister);
     }
 
     Jump branch32(RelationalCondition cond, Address left, RegisterID right)
     {
-        load32(left, dataTempRegister);
-        return branch32(cond, dataTempRegister, right);
+        load32(left, addressTempRegister);
+        return branch32(cond, addressTempRegister, right);
     }
 
     Jump branch32(RelationalCondition cond, Address left, TrustedImm32 right)
@@ -1658,13 +1730,12 @@
 
     Jump branch32(RelationalCondition cond, AbsoluteAddress left, RegisterID right)
     {
-        load32(left.m_ptr, dataTempRegister);
-        return branch32(cond, dataTempRegister, right);
+        load32(left.m_ptr, addressTempRegister);
+        return branch32(cond, addressTempRegister, right);
     }
 
     Jump branch32(RelationalCondition cond, AbsoluteAddress left, TrustedImm32 right)
     {
-        // use addressTempRegister incase the branch32 we call uses dataTempRegister. :-/
         load32(left.m_ptr, addressTempRegister);
         return branch32(cond, addressTempRegister, right);
     }
@@ -1800,22 +1871,22 @@
 
     void farJump(TrustedImmPtr target, PtrTag)
     {
-        move(target, dataTempRegister);
-        m_assembler.bx(dataTempRegister);
+        move(target, addressTempRegister);
+        m_assembler.bx(addressTempRegister);
     }
 
     // Address is a memory location containing the address to jump to
     void farJump(Address address, PtrTag)
     {
-        load32(address, dataTempRegister);
-        m_assembler.bx(dataTempRegister);
+        load32(address, addressTempRegister);
+        m_assembler.bx(addressTempRegister);
     }
     
     void farJump(AbsoluteAddress address, PtrTag)
     {
-        move(TrustedImmPtr(address.m_ptr), dataTempRegister);
-        load32(Address(dataTempRegister), dataTempRegister);
-        m_assembler.bx(dataTempRegister);
+        move(TrustedImmPtr(address.m_ptr), addressTempRegister);
+        load32(Address(addressTempRegister), addressTempRegister);
+        m_assembler.bx(addressTempRegister);
     }
 
     ALWAYS_INLINE void farJump(RegisterID target, RegisterID jumpTag) { UNUSED_PARAM(jumpTag), farJump(target, NoPtrTag); }
@@ -1991,8 +2062,8 @@
 
     ALWAYS_INLINE Call call(Address address, PtrTag)
     {
-        load32(address, dataTempRegister);
-        return Call(m_assembler.blx(dataTempRegister), Call::None);
+        load32(address, addressTempRegister);
+        return Call(m_assembler.blx(addressTempRegister), Call::None);
     }
 
     ALWAYS_INLINE Call call(RegisterID callTag) { return UNUSED_PARAM(callTag), call(NoPtrTag); }
@@ -2014,8 +2085,8 @@
 
     void compare32(RelationalCondition cond, Address left, RegisterID right, RegisterID dest)
     {
-        load32(left, dataTempRegister);
-        compare32(cond, dataTempRegister, right, dest);
+        load32(left, addressTempRegister);
+        compare32(cond, addressTempRegister, right, dest);
     }
 
     void compare8(RelationalCondition cond, Address left, TrustedImm32 right, RegisterID dest)

Modified: trunk/Source/JavaScriptCore/assembler/MacroAssemblerMIPS.h (286423 => 286424)


--- trunk/Source/JavaScriptCore/assembler/MacroAssemblerMIPS.h	2021-12-02 14:15:29 UTC (rev 286423)
+++ trunk/Source/JavaScriptCore/assembler/MacroAssemblerMIPS.h	2021-12-02 14:26:22 UTC (rev 286424)
@@ -1321,16 +1321,35 @@
 
     void loadPair32(RegisterID src, TrustedImm32 offset, RegisterID dest1, RegisterID dest2)
     {
+        loadPair32(Address(src, offset.m_value), dest1, dest2);
+    }
+
+    void loadPair32(Address address, RegisterID dest1, RegisterID dest2)
+    {
         ASSERT(dest1 != dest2); // If it is the same, ldp becomes illegal instruction.
-        if (src == dest1) {
-            load32(Address(src, offset.m_value + 4), dest2);
-            load32(Address(src, offset.m_value), dest1);
+        if (address.base == dest1) {
+            load32(address.withOffset(4), dest2);
+            load32(address, dest1);
         } else {
-            load32(Address(src, offset.m_value), dest1);
-            load32(Address(src, offset.m_value + 4), dest2);
+            load32(address, dest1);
+            load32(address.withOffset(4), dest2);
         }
     }
 
+    void loadPair32(BaseIndex address, RegisterID dest1, RegisterID dest2)
+    {
+        if (address.base == dest1 || address.index == dest1) {
+            RELEASE_ASSERT(address.base != dest2);
+            RELEASE_ASSERT(address.index != dest2);
+
+            load32(address.withOffset(4), dest2);
+            load32(address, dest1);
+        } else {
+            load32(address, dest1);
+            load32(address.withOffset(4), dest2);
+        }
+    }
+
     void store8(RegisterID src, Address address)
     {
         if (address.offset >= -32768 && address.offset <= 32767
@@ -1631,10 +1650,21 @@
 
     void storePair32(RegisterID src1, RegisterID src2, RegisterID dest, TrustedImm32 offset)
     {
-        store32(src1, Address(dest, offset.m_value));
-        store32(src2, Address(dest, offset.m_value + 4));
+        storePair32(src1, src2, Address(dest, offset.m_value));
     }
 
+    void storePair32(RegisterID src1, RegisterID src2, Address address)
+    {
+        store32(src1, address);
+        store32(src2, address.withOffset(4));
+    }
+
+    void storePair32(RegisterID src1, RegisterID src2, BaseIndex address)
+    {
+        store32(src1, address);
+        store32(src2, address.withOffset(4));
+    }
+
     // Floating-point operations:
 
     static bool supportsFloatingPoint()

Modified: trunk/Source/JavaScriptCore/jit/AssemblyHelpers.h (286423 => 286424)


--- trunk/Source/JavaScriptCore/jit/AssemblyHelpers.h	2021-12-02 14:15:29 UTC (rev 286423)
+++ trunk/Source/JavaScriptCore/jit/AssemblyHelpers.h	2021-12-02 14:26:22 UTC (rev 286424)
@@ -164,8 +164,8 @@
 #if USE(JSVALUE64)
         store64(regs.gpr(), address);
 #else
-        store32(regs.payloadGPR(), address.withOffset(PayloadOffset));
-        store32(regs.tagGPR(), address.withOffset(TagOffset));
+        static_assert(!PayloadOffset && TagOffset == 4, "Assumes little-endian system");
+        storePair32(regs.payloadGPR(), regs.tagGPR(), address);
 #endif
     }
     
@@ -174,8 +174,8 @@
 #if USE(JSVALUE64)
         store64(regs.gpr(), address);
 #else
-        store32(regs.payloadGPR(), address.withOffset(PayloadOffset));
-        store32(regs.tagGPR(), address.withOffset(TagOffset));
+        static_assert(!PayloadOffset && TagOffset == 4, "Assumes little-endian system");
+        storePair32(regs.payloadGPR(), regs.tagGPR(), address);
 #endif
     }
     
@@ -194,13 +194,8 @@
 #if USE(JSVALUE64)
         load64(address, regs.gpr());
 #else
-        if (address.base == regs.payloadGPR()) {
-            load32(address.withOffset(TagOffset), regs.tagGPR());
-            load32(address.withOffset(PayloadOffset), regs.payloadGPR());
-        } else {
-            load32(address.withOffset(PayloadOffset), regs.payloadGPR());
-            load32(address.withOffset(TagOffset), regs.tagGPR());
-        }
+        static_assert(!PayloadOffset && TagOffset == 4, "Assumes little-endian system");
+        loadPair32(address, regs.payloadGPR(), regs.tagGPR());
 #endif
     }
     
@@ -209,18 +204,8 @@
 #if USE(JSVALUE64)
         load64(address, regs.gpr());
 #else
-        if (address.base == regs.payloadGPR() || address.index == regs.payloadGPR()) {
-            // We actually could handle the case where the registers are aliased to both
-            // tag and payload, but we don't for now.
-            RELEASE_ASSERT(address.base != regs.tagGPR());
-            RELEASE_ASSERT(address.index != regs.tagGPR());
-            
-            load32(address.withOffset(TagOffset), regs.tagGPR());
-            load32(address.withOffset(PayloadOffset), regs.payloadGPR());
-        } else {
-            load32(address.withOffset(PayloadOffset), regs.payloadGPR());
-            load32(address.withOffset(TagOffset), regs.tagGPR());
-        }
+        static_assert(!PayloadOffset && TagOffset == 4, "Assumes little-endian system");
+        loadPair32(address, regs.payloadGPR(), regs.tagGPR());
 #endif
     }
 

Modified: trunk/Source/JavaScriptCore/jit/JIT.cpp (286423 => 286424)


--- trunk/Source/JavaScriptCore/jit/JIT.cpp	2021-12-02 14:15:29 UTC (rev 286423)
+++ trunk/Source/JavaScriptCore/jit/JIT.cpp	2021-12-02 14:26:22 UTC (rev 286424)
@@ -159,14 +159,11 @@
     checkStackPointerAlignment();
 }
 
-#define NEXT_OPCODE(name) \
-    m_bytecodeIndex = BytecodeIndex(m_bytecodeIndex.offset() + currentInstruction->size()); \
-    break;
-
 #define NEXT_OPCODE_IN_MAIN(name) \
     if (previousSlowCasesSize != m_slowCases.size()) \
         ++m_bytecodeCountHavingSlowCase; \
-    NEXT_OPCODE(name)
+    m_bytecodeIndex = BytecodeIndex(m_bytecodeIndex.offset() + currentInstruction->size()); \
+    break;
 
 #define DEFINE_SLOW_OP(name) \
     case op_##name: { \
@@ -188,13 +185,13 @@
 #define DEFINE_SLOWCASE_OP(name) \
     case name: { \
         emitSlow_##name(currentInstruction, iter); \
-        NEXT_OPCODE(name); \
+        break; \
     }
 
 #define DEFINE_SLOWCASE_SLOW_OP(name) \
     case op_##name: { \
         emitSlowCaseCall(currentInstruction, iter, slow_path_##name); \
-        NEXT_OPCODE(op_##name); \
+        break; \
     }
 
 void JIT::emitSlowCaseCall(const Instruction* currentInstruction, Vector<SlowCaseEntry>::iterator& iter, SlowPathFunction stub)
@@ -675,12 +672,14 @@
 
         RELEASE_ASSERT_WITH_MESSAGE(iter == m_slowCases.end() || firstTo.offset() != iter->to.offset(), "Not enough jumps linked in slow case codegen.");
         RELEASE_ASSERT_WITH_MESSAGE(firstTo.offset() == (iter - 1)->to.offset(), "Too many jumps linked in slow case codegen.");
-        
-        emitJumpSlowToHot(jump(), 0);
+
+        jump().linkTo(fastPathResumePoint(), this);
         ++bytecodeCountHavingSlowCase;
 
-        if (UNLIKELY(sizeMarker))
+        if (UNLIKELY(sizeMarker)) {
+            m_bytecodeIndex = BytecodeIndex(m_bytecodeIndex.offset() + currentInstruction->size());
             m_vm->jitSizeStatistics->markEnd(WTFMove(*sizeMarker), *this);
+        }
     }
 
     RELEASE_ASSERT(bytecodeCountHavingSlowCase == m_bytecodeCountHavingSlowCase);

Modified: trunk/Source/JavaScriptCore/jit/JIT.h (286423 => 286424)


--- trunk/Source/JavaScriptCore/jit/JIT.h	2021-12-02 14:15:29 UTC (rev 286423)
+++ trunk/Source/JavaScriptCore/jit/JIT.h	2021-12-02 14:26:22 UTC (rev 286424)
@@ -280,6 +280,8 @@
 
         void advanceToNextCheckpoint();
         void emitJumpSlowToHotForCheckpoint(Jump);
+        void setFastPathResumePoint();
+        Label fastPathResumePoint() const;
 
         void addSlowCase(Jump);
         void addSlowCase(const JumpList&);
@@ -620,7 +622,7 @@
         void emitSlow_op_iterator_next(const Instruction*, Vector<SlowCaseEntry>::iterator&);
 
         void emitHasPrivate(VirtualRegister dst, VirtualRegister base, VirtualRegister propertyOrBrand, AccessType);
-        void emitHasPrivateSlow(VirtualRegister dst, VirtualRegister base, VirtualRegister property, AccessType);
+        void emitHasPrivateSlow(VirtualRegister base, VirtualRegister property, AccessType);
 
         template<typename Op>
         void emitNewFuncCommon(const Instruction*);
@@ -630,8 +632,6 @@
         void emitVarReadOnlyCheck(ResolveType, GPRReg scratchGPR);
         void emitNotifyWriteWatchpoint(GPRReg pointerToSet);
 
-        void emitInitRegister(VirtualRegister);
-
         bool isKnownCell(VirtualRegister);
 
         JSValue getConstantOperand(VirtualRegister);
@@ -911,6 +911,7 @@
         Vector<NearJumpRecord> m_nearJumps;
         Vector<Label> m_labels;
         HashMap<BytecodeIndex, Label> m_checkpointLabels;
+        HashMap<BytecodeIndex, Label> m_fastPathResumeLabels;
         Vector<JITGetByIdGenerator> m_getByIds;
         Vector<JITGetByValGenerator> m_getByVals;
         Vector<JITGetByIdWithThisGenerator> m_getByIdsWithThis;

Modified: trunk/Source/JavaScriptCore/jit/JITCall.cpp (286423 => 286424)


--- trunk/Source/JavaScriptCore/jit/JITCall.cpp	2021-12-02 14:15:29 UTC (rev 286423)
+++ trunk/Source/JavaScriptCore/jit/JITCall.cpp	2021-12-02 14:26:22 UTC (rev 286424)
@@ -185,6 +185,7 @@
     callOperation(operationCallEval, argumentGPR0, argumentGPR1, argumentGPR2);
     addSlowCase(branchIfEmpty(returnValueJSR));
 
+    setFastPathResumePoint();
     emitPutCallResult(bytecode);
 
     return true;
@@ -205,8 +206,6 @@
     materializePointerIntoMetadata(bytecode, OpCallEval::Metadata::offsetOfCallLinkInfo(), callLinkInfoGPR);
     emitVirtualCallWithoutMovingGlobalObject(*m_vm, callLinkInfoGPR, CallMode::Regular);
     resetSP();
-
-    emitPutCallResult(bytecode);
 }
 
 template<typename Op>
@@ -302,8 +301,9 @@
 
     m_callCompilationInfo[callLinkInfoIndex].doneLocation = doneLocation;
 
+    if constexpr (Op::opcodeID != op_iterator_open && Op::opcodeID != op_iterator_next)
+        setFastPathResumePoint();
     resetSP();
-
     emitPutCallResult(bytecode);
 }
 
@@ -328,10 +328,6 @@
         abortWithReason(JITDidReturnFromTailCall);
         return;
     }
-
-    resetSP();
-
-    emitPutCallResult(bytecode);
 }
 
 void JIT::emit_op_call(const Instruction* currentInstruction)
@@ -470,8 +466,12 @@
 
 void JIT::emitSlow_op_iterator_open(const Instruction* instruction, Vector<SlowCaseEntry>::iterator& iter)
 {
+    auto bytecode = instruction->as<OpIteratorOpen>();
+
     linkAllSlowCases(iter);
     compileOpCallSlowCase<OpIteratorOpen>(instruction, iter, m_callLinkInfoIndex++);
+    resetSP();
+    emitPutCallResult(bytecode);
     emitJumpSlowToHotForCheckpoint(jump());
 
     linkAllSlowCases(iter);
@@ -480,7 +480,6 @@
     notObject.append(branchIfNotCell(iteratorJSR));
     notObject.append(branchIfNotObject(iteratorJSR.payloadGPR()));
 
-    auto bytecode = instruction->as<OpIteratorOpen>();
     VirtualRegister nextVReg = bytecode.m_next;
     UniquedStringImpl* ident = vm().propertyNames->next.impl();
 
@@ -615,8 +614,12 @@
 
 void JIT::emitSlow_op_iterator_next(const Instruction* instruction, Vector<SlowCaseEntry>::iterator& iter)
 {
+    auto bytecode = instruction->as<OpIteratorNext>();
+
     linkAllSlowCases(iter);
     compileOpCallSlowCase<OpIteratorNext>(instruction, iter, m_callLinkInfoIndex++);
+    resetSP();
+    emitPutCallResult(bytecode);
     emitJumpSlowToHotForCheckpoint(jump());
 
     using BaselineGetByIdRegisters::resultJSR;
@@ -624,7 +627,6 @@
 
     constexpr JSValueRegs iterCallResultJSR = dontClobberJSR;
 
-    auto bytecode = instruction->as<OpIteratorNext>();
     {
         VirtualRegister doneVReg = bytecode.m_done;
 

Modified: trunk/Source/JavaScriptCore/jit/JITInlineCacheGenerator.cpp (286423 => 286424)


--- trunk/Source/JavaScriptCore/jit/JITInlineCacheGenerator.cpp	2021-12-02 14:15:29 UTC (rev 286423)
+++ trunk/Source/JavaScriptCore/jit/JITInlineCacheGenerator.cpp	2021-12-02 14:26:22 UTC (rev 286424)
@@ -135,18 +135,12 @@
 
 static void generateGetByIdInlineAccess(JIT& jit, GPRReg stubInfoGPR, JSValueRegs baseJSR, GPRReg scratchGPR, JSValueRegs resultJSR)
 {
-    CCallHelpers::JumpList done;
-
     jit.load32(CCallHelpers::Address(baseJSR.payloadGPR(), JSCell::structureIDOffset()), scratchGPR);
-    auto skipInlineAccess = jit.branch32(CCallHelpers::NotEqual, scratchGPR, CCallHelpers::Address(stubInfoGPR, StructureStubInfo::offsetOfInlineAccessBaseStructure()));
+    auto doInlineAccess = jit.branch32(CCallHelpers::Equal, scratchGPR, CCallHelpers::Address(stubInfoGPR, StructureStubInfo::offsetOfInlineAccessBaseStructure()));
+    jit.farJump(CCallHelpers::Address(stubInfoGPR, StructureStubInfo::offsetOfCodePtr()), JITStubRoutinePtrTag);
+    doInlineAccess.link(&jit);
     jit.load32(CCallHelpers::Address(stubInfoGPR, StructureStubInfo::offsetOfByIdSelfOffset()), scratchGPR);
     jit.loadProperty(baseJSR.payloadGPR(), scratchGPR, resultJSR);
-    auto finished = jit.jump();
-
-    skipInlineAccess.link(&jit);
-    jit.farJump(CCallHelpers::Address(stubInfoGPR, StructureStubInfo::offsetOfCodePtr()), JITStubRoutinePtrTag);
-
-    finished.link(&jit);
 }
 
 void JITGetByIdGenerator::generateBaselineDataICFastPath(JIT& jit, unsigned stubInfo, GPRReg stubInfoGPR)
@@ -227,15 +221,11 @@
     using BaselinePutByIdRegisters::scratch2GPR;
 
     jit.load32(CCallHelpers::Address(baseJSR.payloadGPR(), JSCell::structureIDOffset()), scratchGPR);
-    auto skipInlineAccess = jit.branch32(CCallHelpers::NotEqual, scratchGPR, CCallHelpers::Address(stubInfoGPR, StructureStubInfo::offsetOfInlineAccessBaseStructure()));
+    auto doInlineAccess = jit.branch32(CCallHelpers::Equal, scratchGPR, CCallHelpers::Address(stubInfoGPR, StructureStubInfo::offsetOfInlineAccessBaseStructure()));
+    jit.farJump(CCallHelpers::Address(stubInfoGPR, StructureStubInfo::offsetOfCodePtr()), JITStubRoutinePtrTag);
+    doInlineAccess.link(&jit);
     jit.load32(CCallHelpers::Address(stubInfoGPR, StructureStubInfo::offsetOfByIdSelfOffset()), scratchGPR);
     jit.storeProperty(valueJSR, baseJSR.payloadGPR(), scratchGPR, scratch2GPR);
-    auto finished = jit.jump();
-
-    skipInlineAccess.link(&jit);
-    jit.farJump(CCallHelpers::Address(stubInfoGPR, StructureStubInfo::offsetOfCodePtr()), JITStubRoutinePtrTag);
-
-    finished.link(&jit);
     m_done = jit.label();
 }
 

Modified: trunk/Source/JavaScriptCore/jit/JITInlineCacheGenerator.h (286423 => 286424)


--- trunk/Source/JavaScriptCore/jit/JITInlineCacheGenerator.h	2021-12-02 14:15:29 UTC (rev 286423)
+++ trunk/Source/JavaScriptCore/jit/JITInlineCacheGenerator.h	2021-12-02 14:26:22 UTC (rev 286424)
@@ -49,16 +49,15 @@
 enum class JITType : uint8_t;
 
 namespace BaselineDelByValRegisters {
+constexpr JSValueRegs resultJSR { JSRInfo::returnValueJSR };
 #if USE(JSVALUE64)
 constexpr JSValueRegs baseJSR { GPRInfo::regT1 };
 constexpr JSValueRegs propertyJSR { GPRInfo::regT0 };
-constexpr JSValueRegs resultJSR { propertyJSR };
 constexpr GPRReg stubInfoGPR { GPRInfo::regT3 };
 constexpr GPRReg scratchGPR { GPRInfo::regT2 };
 #elif USE(JSVALUE32_64)
 constexpr JSValueRegs baseJSR { GPRInfo::regT3, GPRInfo::regT2 };
 constexpr JSValueRegs propertyJSR { GPRInfo::regT1, GPRInfo::regT0 };
-constexpr JSValueRegs resultJSR { propertyJSR };
 constexpr GPRReg stubInfoGPR { GPRInfo::regT7 };
 constexpr GPRReg scratchGPR { GPRInfo::regT6 };
 #endif
@@ -65,14 +64,13 @@
 }
 
 namespace BaselineDelByIdRegisters {
+constexpr JSValueRegs resultJSR { JSRInfo::returnValueJSR };
 #if USE(JSVALUE64)
 constexpr JSValueRegs baseJSR { GPRInfo::regT1 };
-constexpr JSValueRegs resultJSR { GPRInfo::regT0 };
 constexpr GPRReg stubInfoGPR { GPRInfo::regT3 };
 constexpr GPRReg scratchGPR { GPRInfo::regT2 };
 #elif USE(JSVALUE32_64)
 constexpr JSValueRegs baseJSR { GPRInfo::regT3, GPRInfo::regT2 };
-constexpr JSValueRegs resultJSR { GPRInfo::regT1, GPRInfo::regT0 };
 constexpr GPRReg stubInfoGPR { GPRInfo::regT7 };
 constexpr GPRReg scratchGPR { GPRInfo::regT6 };
 #endif
@@ -79,16 +77,15 @@
 }
 
 namespace BaselineGetByValRegisters {
+constexpr JSValueRegs resultJSR { JSRInfo::returnValueJSR };
 #if USE(JSVALUE64)
 constexpr JSValueRegs baseJSR { GPRInfo::regT0 };
 constexpr JSValueRegs propertyJSR { GPRInfo::regT1 };
-constexpr JSValueRegs resultJSR { baseJSR };
 constexpr GPRReg stubInfoGPR { GPRInfo::regT2 };
 constexpr GPRReg scratchGPR { GPRInfo::regT3 };
 #elif USE(JSVALUE32_64)
 constexpr JSValueRegs baseJSR { GPRInfo::regT1, GPRInfo::regT0 };
 constexpr JSValueRegs propertyJSR { GPRInfo::regT3, GPRInfo::regT2 };
-constexpr JSValueRegs resultJSR { baseJSR };
 constexpr GPRReg stubInfoGPR { GPRInfo::regT7 };
 constexpr GPRReg scratchGPR { GPRInfo::regT6 };
 #endif
@@ -136,16 +133,15 @@
 }
 
 namespace BaselineInByValRegisters {
+constexpr JSValueRegs resultJSR { JSRInfo::returnValueJSR };
 #if USE(JSVALUE64)
 constexpr JSValueRegs baseJSR { GPRInfo::regT0 };
 constexpr JSValueRegs propertyJSR { GPRInfo::regT1 };
-constexpr JSValueRegs resultJSR { baseJSR };
 constexpr GPRReg stubInfoGPR { GPRInfo::regT2 };
 constexpr GPRReg scratchGPR { GPRInfo::regT3 };
 #elif USE(JSVALUE32_64)
 constexpr JSValueRegs baseJSR { GPRInfo::regT1, GPRInfo::regT0 };
 constexpr JSValueRegs propertyJSR  { GPRInfo::regT3, GPRInfo::regT2 };
-constexpr JSValueRegs resultJSR { baseJSR };
 constexpr GPRReg stubInfoGPR { GPRInfo::regT7 };
 constexpr GPRReg scratchGPR { GPRInfo::regT6 };
 #endif
@@ -154,31 +150,29 @@
 }
 
 namespace BaselineGetByIdRegisters {
+constexpr JSValueRegs resultJSR { JSRInfo::returnValueJSR };
+constexpr JSValueRegs baseJSR { JSRInfo::returnValueJSR };
 #if USE(JSVALUE64)
-constexpr JSValueRegs baseJSR { GPRInfo::regT0 };
-constexpr JSValueRegs resultJSR { baseJSR };
 constexpr GPRReg stubInfoGPR { GPRInfo::regT1 };
 constexpr GPRReg scratchGPR { GPRInfo::regT2 };
 constexpr JSValueRegs dontClobberJSR { GPRInfo::regT3 };
 #elif USE(JSVALUE32_64)
-constexpr JSValueRegs baseJSR { GPRInfo::regT1, GPRInfo::regT0 };
-constexpr JSValueRegs resultJSR { baseJSR };
 constexpr GPRReg stubInfoGPR { GPRInfo::regT2 };
 constexpr GPRReg scratchGPR { GPRInfo::regT3 };
 constexpr JSValueRegs dontClobberJSR { GPRInfo::regT6, GPRInfo::regT7 };
 #endif
+static_assert(AssemblyHelpers::noOverlap(baseJSR, stubInfoGPR, scratchGPR, dontClobberJSR));
 }
 
 namespace BaselineGetByIdWithThisRegisters {
+constexpr JSValueRegs resultJSR { JSRInfo::returnValueJSR };
 #if USE(JSVALUE64)
 constexpr JSValueRegs baseJSR { GPRInfo::regT0 };
-constexpr JSValueRegs resultJSR { baseJSR };
 constexpr JSValueRegs thisJSR { GPRInfo::regT1 };
 constexpr GPRReg stubInfoGPR { GPRInfo::regT2 };
 constexpr GPRReg scratchGPR { GPRInfo::regT3 };
 #elif USE(JSVALUE32_64)
 constexpr JSValueRegs baseJSR { GPRInfo::regT1, GPRInfo::regT0 };
-constexpr JSValueRegs resultJSR { baseJSR };
 constexpr JSValueRegs thisJSR { GPRInfo::regT3, GPRInfo::regT2 };
 constexpr GPRReg stubInfoGPR { GPRInfo::regT7 };
 constexpr GPRReg scratchGPR { GPRInfo::regT6 };
@@ -187,7 +181,7 @@
 
 namespace BaselineInByIdRegisters {
 constexpr JSValueRegs baseJSR { BaselineGetByIdRegisters::baseJSR };
-constexpr JSValueRegs resultJSR { BaselineGetByIdRegisters::resultJSR };
+constexpr JSValueRegs resultJSR { JSRInfo::returnValueJSR };
 constexpr GPRReg stubInfoGPR { BaselineGetByIdRegisters::stubInfoGPR };
 constexpr GPRReg scratchGPR { BaselineGetByIdRegisters::scratchGPR };
 }
@@ -204,7 +198,7 @@
 constexpr JSValueRegs valueJSR { GPRInfo::regT3, GPRInfo::regT2 };
 constexpr GPRReg stubInfoGPR { GPRInfo::regT7 };
 constexpr GPRReg scratchGPR { GPRInfo::regT6 };
-constexpr GPRReg scratch2GPR { GPRInfo::regT4 };
+constexpr GPRReg scratch2GPR { baseJSR.tagGPR() }; // Reusing regT1 for better code size on ARM_THUMB2
 #endif
 }
 

Modified: trunk/Source/JavaScriptCore/jit/JITInlines.h (286423 => 286424)


--- trunk/Source/JavaScriptCore/jit/JITInlines.h	2021-12-02 14:15:29 UTC (rev 286423)
+++ trunk/Source/JavaScriptCore/jit/JITInlines.h	2021-12-02 14:26:22 UTC (rev 286424)
@@ -215,6 +215,26 @@
     jump.linkTo(iter->value, this);
 }
 
+inline void JIT::setFastPathResumePoint()
+{
+    ASSERT_WITH_MESSAGE(m_bytecodeIndex, "This method should only be called during hot/cold path generation, so that m_bytecodeIndex is set");
+    auto result = m_fastPathResumeLabels.add(m_bytecodeIndex, label());
+    ASSERT_UNUSED(result, result.isNewEntry);
+}
+
+inline MacroAssembler::Label JIT::fastPathResumePoint() const
+{
+    ASSERT_WITH_MESSAGE(m_bytecodeIndex, "This method should only be called during hot/cold path generation, so that m_bytecodeIndex is set");
+    // Location set by setFastPathResumePoint
+    auto iter = m_fastPathResumeLabels.find(m_bytecodeIndex);
+    if (iter != m_fastPathResumeLabels.end())
+        return iter->value;
+    // Next instruction in sequence
+    const Instruction* currentInstruction = m_unlinkedCodeBlock->instructions().at(m_bytecodeIndex).ptr();
+    return m_labels[m_bytecodeIndex.offset() + currentInstruction->size()];
+}
+
+
 ALWAYS_INLINE void JIT::addSlowCase(Jump jump)
 {
     ASSERT(m_bytecodeIndex); // This method should only be called during hot/cold path generation, so that m_bytecodeIndex is set.
@@ -329,11 +349,6 @@
 }
 #endif
 
-ALWAYS_INLINE void JIT::emitInitRegister(VirtualRegister dst)
-{
-    storeTrustedValue(jsUndefined(), addressFor(dst));
-}
-
 ALWAYS_INLINE void JIT::emitGetVirtualRegister(VirtualRegister src, JSValueRegs dst)
 {
     ASSERT(m_bytecodeIndex); // This method should only be called during hot/cold path generation, so that m_bytecodeIndex is set.

Modified: trunk/Source/JavaScriptCore/jit/JITOpcodes.cpp (286423 => 286424)


--- trunk/Source/JavaScriptCore/jit/JITOpcodes.cpp	2021-12-02 14:15:29 UTC (rev 286423)
+++ trunk/Source/JavaScriptCore/jit/JITOpcodes.cpp	2021-12-02 14:26:22 UTC (rev 286424)
@@ -1253,10 +1253,12 @@
     // object lifetime and increasing GC pressure.
     size_t count = m_unlinkedCodeBlock->numVars();
 #if !ENABLE(EXTRA_CTI_THUNKS)
-    for (size_t j = CodeBlock::llintBaselineCalleeSaveSpaceAsVirtualRegisters(); j < count; ++j)
-        emitInitRegister(virtualRegisterForLocal(j));
+    size_t first = CodeBlock::llintBaselineCalleeSaveSpaceAsVirtualRegisters();
+    if (first < count)
+        moveTrustedValue(jsUndefined(), jsRegT10);
+    for (size_t j = first; j < count; ++j)
+        emitPutVirtualRegister(virtualRegisterForLocal(j), jsRegT10);
 
-    
     loadPtr(addressFor(CallFrameSlot::codeBlock), regT0);
     emitWriteBarrier(regT0);
 

Modified: trunk/Source/JavaScriptCore/jit/JITPropertyAccess.cpp (286423 => 286424)


--- trunk/Source/JavaScriptCore/jit/JITPropertyAccess.cpp	2021-12-02 14:15:29 UTC (rev 286423)
+++ trunk/Source/JavaScriptCore/jit/JITPropertyAccess.cpp	2021-12-02 14:26:22 UTC (rev 286424)
@@ -89,6 +89,7 @@
         addSlowCase();
         m_getByVals.append(gen);
 
+        setFastPathResumePoint();
         emitValueProfilingSite(bytecode, resultJSR);
         emitPutVirtualRegister(dst, resultJSR);
     }
@@ -100,8 +101,6 @@
     if (!hasAnySlowCases(iter))
         return;
 
-    VirtualRegister dst = bytecode.m_dst;
-
     linkAllSlowCases(iter);
 
     JITGetByValGenerator& gen = m_getByVals[m_getByValIndex++];
@@ -122,10 +121,8 @@
     loadGlobalObject(globalObjectGPR);
     loadConstant(gen.m_unlinkedStubInfoConstantIndex, stubInfoGPR);
     materializePointerIntoMetadata(bytecode, OpcodeType::Metadata::offsetOfArrayProfile(), profileGPR);
-    callOperationWithProfile<SlowOperation>(
-        bytecode,
+    callOperation<SlowOperation>(
         Address(stubInfoGPR, StructureStubInfo::offsetOfSlowOperation()),
-        dst,
         globalObjectGPR, stubInfoGPR, profileGPR, arg3JSR, arg4JSR);
 #else
     VM& vm = this->vm();
@@ -148,9 +145,6 @@
     materializePointerIntoMetadata(bytecode, OpcodeType::Metadata::offsetOfArrayProfile(), profileGPR);
     emitNakedNearCall(vm.getCTIStub(slow_op_get_by_val_prepareCallGenerator).retaggedCode<NoPtrTag>());
     emitNakedNearCall(vm.getCTIStub(checkExceptionGenerator).retaggedCode<NoPtrTag>());
-
-    emitValueProfilingSite(bytecode, returnValueJSR);
-    emitPutVirtualRegister(dst, returnValueJSR);
 #endif // ENABLE(EXTRA_CTI_THUNKS)
 
     gen.reportSlowPathCall(coldPathBegin, Call());
@@ -231,6 +225,7 @@
     addSlowCase();
     m_getByVals.append(gen);
 
+    setFastPathResumePoint();
     emitValueProfilingSite(bytecode, resultJSR);
     emitPutVirtualRegister(dst, resultJSR);
 }
@@ -238,9 +233,6 @@
 void JIT::emitSlow_op_get_private_name(const Instruction* currentInstruction, Vector<SlowCaseEntry>::iterator& iter)
 {
     ASSERT(hasAnySlowCases(iter));
-    auto bytecode = currentInstruction->as<OpGetPrivateName>();
-    VirtualRegister dst = bytecode.m_dst;
-
     linkAllSlowCases(iter);
 
     JITGetByValGenerator& gen = m_getByVals[m_getByValIndex++];
@@ -247,6 +239,7 @@
     Label coldPathBegin = label();
 
 #if !ENABLE(EXTRA_CTI_THUNKS)
+    auto bytecode = currentInstruction->as<OpGetPrivateName>();
     using SlowOperation = decltype(operationGetPrivateNameOptimize);
     constexpr GPRReg globalObjectGPR = preferredArgumentGPR<SlowOperation, 0>();
     constexpr GPRReg stubInfoGPR = preferredArgumentGPR<SlowOperation, 1>();
@@ -257,12 +250,11 @@
     loadConstant(gen.m_unlinkedStubInfoConstantIndex, stubInfoGPR);
     emitGetVirtualRegister(bytecode.m_base, baseJSR);
     emitGetVirtualRegister(bytecode.m_property, propertyJSR);
-    callOperationWithProfile<SlowOperation>(
-        bytecode,
+    callOperation<SlowOperation>(
         Address(stubInfoGPR, StructureStubInfo::offsetOfSlowOperation()),
-        dst,
         globalObjectGPR, stubInfoGPR, baseJSR, propertyJSR);
 #else
+    UNUSED_PARAM(currentInstruction);
     VM& vm = this->vm();
     uint32_t bytecodeOffset = m_bytecodeIndex.offset();
     ASSERT(BytecodeIndex(bytecodeOffset) == m_bytecodeIndex);
@@ -279,11 +271,9 @@
     loadConstant(gen.m_unlinkedStubInfoConstantIndex, stubInfoGPR);
     emitNakedNearCall(vm.getCTIStub(slow_op_get_private_name_prepareCallGenerator).retaggedCode<NoPtrTag>());
     emitNakedNearCall(vm.getCTIStub(checkExceptionGenerator).retaggedCode<NoPtrTag>());
-
-    emitValueProfilingSite(bytecode, returnValueJSR);
-    emitPutVirtualRegister(dst, returnValueJSR);
 #endif // ENABLE(EXTRA_CTI_THUNKS)
 
+    static_assert(BaselineGetByValRegisters::resultJSR == returnValueJSR);
     gen.reportSlowPathCall(coldPathBegin, Call());
 }
 
@@ -1139,8 +1129,10 @@
     gen.generateBaselineDataICFastPath(*this, stubInfoIndex, stubInfoGPR);
     addSlowCase();
     m_getByIds.append(gen);
-    
+
     emitValueProfilingSite(bytecode, resultJSR);
+
+    setFastPathResumePoint();
     emitPutVirtualRegister(resultVReg, resultJSR);
 }
 
@@ -1149,7 +1141,6 @@
     linkAllSlowCases(iter);
 
     auto bytecode = currentInstruction->as<OpTryGetById>();
-    VirtualRegister resultVReg = bytecode.m_dst;
     const Identifier* ident = &(m_unlinkedCodeBlock->identifier(bytecode.m_property));
 
     JITGetByIdGenerator& gen = m_getByIds[m_getByIdIndex++];
@@ -1167,7 +1158,6 @@
     emitGetVirtualRegister(bytecode.m_base, baseJSR);
     callOperation<SlowOperation>(
         Address(stubInfoGPR, StructureStubInfo::offsetOfSlowOperation()),
-        resultVReg,
         globalObjectGPR, stubInfoGPR, baseJSR,
         CacheableIdentifier::createFromIdentifierOwnedByCodeBlock(m_unlinkedCodeBlock, *ident).rawBits());
 #else
@@ -1188,10 +1178,9 @@
     static_assert(std::is_same<decltype(operationTryGetByIdOptimize), decltype(operationGetByIdOptimize)>::value);
     emitNakedNearCall(vm.getCTIStub(slow_op_get_by_id_prepareCallGenerator).retaggedCode<NoPtrTag>());
     emitNakedNearCall(vm.getCTIStub(checkExceptionGenerator).retaggedCode<NoPtrTag>());
-
-    emitPutVirtualRegister(resultVReg, returnValueGPR);
 #endif // ENABLE(EXTRA_CTI_THUNKS)
 
+    static_assert(BaselineGetByIdRegisters::resultJSR == returnValueJSR);
     gen.reportSlowPathCall(coldPathBegin, Call());
 }
 
@@ -1223,6 +1212,7 @@
     addSlowCase();
     m_getByIds.append(gen);
 
+    setFastPathResumePoint();
     emitValueProfilingSite(bytecode, resultJSR);
     emitPutVirtualRegister(resultVReg, resultJSR);
 }
@@ -1232,7 +1222,6 @@
     linkAllSlowCases(iter);
 
     auto bytecode = currentInstruction->as<OpGetByIdDirect>();
-    VirtualRegister resultVReg = bytecode.m_dst;
     const Identifier* ident = &(m_unlinkedCodeBlock->identifier(bytecode.m_property));
 
     JITGetByIdGenerator& gen = m_getByIds[m_getByIdIndex++];
@@ -1248,10 +1237,8 @@
     loadGlobalObject(globalObjectGPR);
     loadConstant(gen.m_unlinkedStubInfoConstantIndex, stubInfoGPR);
     emitGetVirtualRegister(bytecode.m_base, baseJSR);
-    callOperationWithProfile<SlowOperation>(
-        bytecode,
+    callOperation<SlowOperation>(
         Address(stubInfoGPR, StructureStubInfo::offsetOfSlowOperation()),
-        resultVReg,
         globalObjectGPR, stubInfoGPR, baseJSR,
         CacheableIdentifier::createFromIdentifierOwnedByCodeBlock(m_unlinkedCodeBlock, *ident).rawBits());
 #else
@@ -1272,11 +1259,9 @@
     static_assert(std::is_same<decltype(operationGetByIdDirectOptimize), decltype(operationGetByIdOptimize)>::value);
     emitNakedNearCall(vm.getCTIStub(slow_op_get_by_id_prepareCallGenerator).retaggedCode<NoPtrTag>());
     emitNakedNearCall(vm.getCTIStub(checkExceptionGenerator).retaggedCode<NoPtrTag>());
-
-    emitValueProfilingSite(bytecode, returnValueJSR);
-    emitPutVirtualRegister(resultVReg, returnValueJSR);
 #endif // ENABLE(EXTRA_CTI_THUNKS)
 
+    static_assert(BaselineGetByIdRegisters::resultJSR == returnValueJSR);
     gen.reportSlowPathCall(coldPathBegin, Call());
 }
 
@@ -1320,6 +1305,7 @@
     addSlowCase();
     m_getByIds.append(gen);
 
+    setFastPathResumePoint();
     emitValueProfilingSite(bytecode, resultJSR);
     emitPutVirtualRegister(resultVReg, resultJSR);
 }
@@ -1329,7 +1315,6 @@
     linkAllSlowCases(iter);
 
     auto bytecode = currentInstruction->as<OpGetById>();
-    VirtualRegister resultVReg = bytecode.m_dst;
     const Identifier* ident = &(m_unlinkedCodeBlock->identifier(bytecode.m_property));
 
     JITGetByIdGenerator& gen = m_getByIds[m_getByIdIndex++];
@@ -1345,10 +1330,8 @@
     loadGlobalObject(globalObjectGPR);
     loadConstant(gen.m_unlinkedStubInfoConstantIndex, stubInfoGPR);
     emitGetVirtualRegister(bytecode.m_base, baseJSR);
-    callOperationWithProfile<SlowOperation>(
-        bytecode,
+    callOperation<SlowOperation>(
         Address(stubInfoGPR, StructureStubInfo::offsetOfSlowOperation()),
-        resultVReg,
         globalObjectGPR, stubInfoGPR, baseJSR,
         CacheableIdentifier::createFromIdentifierOwnedByCodeBlock(m_unlinkedCodeBlock, *ident).rawBits());
 #else
@@ -1368,11 +1351,9 @@
     move(TrustedImmPtr(CacheableIdentifier::createFromIdentifierOwnedByCodeBlock(m_unlinkedCodeBlock, *ident).rawBits()), propertyGPR);
     emitNakedNearCall(vm.getCTIStub(slow_op_get_by_id_prepareCallGenerator).retaggedCode<NoPtrTag>());
     emitNakedNearCall(vm.getCTIStub(checkExceptionGenerator).retaggedCode<NoPtrTag>());
-
-    emitValueProfilingSite(bytecode, returnValueJSR);
-    emitPutVirtualRegister(resultVReg, returnValueJSR);
 #endif // ENABLE(EXTRA_CTI_THUNKS)
 
+    static_assert(BaselineGetByIdRegisters::resultJSR == returnValueJSR);
     gen.reportSlowPathCall(coldPathBegin, Call());
 }
 
@@ -1409,6 +1390,7 @@
     addSlowCase();
     m_getByIdsWithThis.append(gen);
 
+    setFastPathResumePoint();
     emitValueProfilingSite(bytecode, resultJSR);
     emitPutVirtualRegister(resultVReg, resultJSR);
 }
@@ -1455,7 +1437,6 @@
     linkAllSlowCases(iter);
 
     auto bytecode = currentInstruction->as<OpGetByIdWithThis>();
-    VirtualRegister resultVReg = bytecode.m_dst;
     const Identifier* ident = &(m_unlinkedCodeBlock->identifier(bytecode.m_property));
 
     JITGetByIdWithThisGenerator& gen = m_getByIdsWithThis[m_getByIdWithThisIndex++];
@@ -1473,10 +1454,8 @@
     loadConstant(gen.m_unlinkedStubInfoConstantIndex, stubInfoGPR);
     emitGetVirtualRegister(bytecode.m_base, baseJSR);
     emitGetVirtualRegister(bytecode.m_thisValue, thisJSR);
-    callOperationWithProfile<SlowOperation>(
-        bytecode,
+    callOperation<SlowOperation>(
         Address(stubInfoGPR, StructureStubInfo::offsetOfSlowOperation()),
-        resultVReg,
         globalObjectGPR, stubInfoGPR, baseJSR, thisJSR,
         CacheableIdentifier::createFromIdentifierOwnedByCodeBlock(m_unlinkedCodeBlock, *ident).rawBits());
 #else
@@ -1498,11 +1477,9 @@
     move(TrustedImmPtr(CacheableIdentifier::createFromIdentifierOwnedByCodeBlock(m_unlinkedCodeBlock, *ident).rawBits()), propertyGPR);
     emitNakedNearCall(vm.getCTIStub(slow_op_get_by_id_with_this_prepareCallGenerator).retaggedCode<NoPtrTag>());
     emitNakedNearCall(vm.getCTIStub(checkExceptionGenerator).retaggedCode<NoPtrTag>());
-
-    emitValueProfilingSite(bytecode, returnValueJSR);
-    emitPutVirtualRegister(resultVReg, returnValueJSR);
 #endif // ENABLE(EXTRA_CTI_THUNKS)
 
+    static_assert(BaselineGetByIdWithThisRegisters::resultJSR == returnValueJSR);
     gen.reportSlowPathCall(coldPathBegin, Call());
 }
 
@@ -1708,6 +1685,7 @@
     addSlowCase();
     m_inByIds.append(gen);
 
+    setFastPathResumePoint();
     emitPutVirtualRegister(resultVReg, resultJSR);
 }
 
@@ -1716,7 +1694,6 @@
     linkAllSlowCases(iter);
 
     auto bytecode = currentInstruction->as<OpInById>();
-    VirtualRegister resultVReg = bytecode.m_dst;
     const Identifier* ident = &(m_unlinkedCodeBlock->identifier(bytecode.m_property));
 
     JITInByIdGenerator& gen = m_inByIds[m_inByIdIndex++];
@@ -1734,7 +1711,6 @@
     emitGetVirtualRegister(bytecode.m_base, baseJSR);
     callOperation<SlowOperation>(
         Address(stubInfoGPR, StructureStubInfo::offsetOfSlowOperation()),
-        resultVReg,
         globalObjectGPR, stubInfoGPR, baseJSR,
         CacheableIdentifier::createFromIdentifierOwnedByCodeBlock(m_unlinkedCodeBlock, *ident).rawBits());
 #else
@@ -1757,10 +1733,9 @@
     static_assert(std::is_same<decltype(operationInByIdOptimize), decltype(operationGetByIdOptimize)>::value);
     emitNakedNearCall(vm.getCTIStub(slow_op_get_by_id_prepareCallGenerator).retaggedCode<NoPtrTag>());
     emitNakedNearCall(vm.getCTIStub(checkExceptionGenerator).retaggedCode<NoPtrTag>());
-
-    emitPutVirtualRegister(resultVReg, returnValueGPR);
 #endif // ENABLE(EXTRA_CTI_THUNKS)
 
+    static_assert(BaselineInByIdRegisters::resultJSR == returnValueJSR);
     gen.reportSlowPathCall(coldPathBegin, Call());
 }
 
@@ -1796,6 +1771,7 @@
     addSlowCase();
     m_inByVals.append(gen);
 
+    setFastPathResumePoint();
     emitPutVirtualRegister(dst, resultJSR);
 }
 
@@ -1804,7 +1780,6 @@
     linkAllSlowCases(iter);
 
     auto bytecode = currentInstruction->as<OpInByVal>();
-    VirtualRegister dst = bytecode.m_dst;
 
     JITInByValGenerator& gen = m_inByVals[m_inByValIndex++];
 
@@ -1825,7 +1800,6 @@
     emitGetVirtualRegister(bytecode.m_property, propertyJSR);
     callOperation<SlowOperation>(
         Address(stubInfoGPR, StructureStubInfo::offsetOfSlowOperation()),
-        dst,
         globalObjectGPR, stubInfoGPR, profileGPR, baseJSR, propertyJSR);
 #else
     VM& vm = this->vm();
@@ -1849,10 +1823,9 @@
     static_assert(std::is_same<decltype(operationInByValOptimize), decltype(operationGetByValOptimize)>::value);
     emitNakedNearCall(vm.getCTIStub(slow_op_get_by_val_prepareCallGenerator).retaggedCode<NoPtrTag>());
     emitNakedNearCall(vm.getCTIStub(checkExceptionGenerator).retaggedCode<NoPtrTag>());
-
-    emitPutVirtualRegister(dst, returnValueGPR);
 #endif // ENABLE(EXTRA_CTI_THUNKS)
 
+    static_assert(BaselineInByValRegisters::resultJSR == returnValueJSR);
     gen.reportSlowPathCall(coldPathBegin, Call());
 }
 
@@ -1881,10 +1854,11 @@
     addSlowCase();
     m_inByVals.append(gen);
 
+    setFastPathResumePoint();
     emitPutVirtualRegister(dst, resultJSR);
 }
 
-void JIT::emitHasPrivateSlow(VirtualRegister dst, VirtualRegister base, VirtualRegister property, AccessType type)
+void JIT::emitHasPrivateSlow(VirtualRegister base, VirtualRegister property, AccessType type)
 {
     UNUSED_PARAM(base);
     UNUSED_PARAM(property);
@@ -1906,7 +1880,6 @@
     emitGetVirtualRegister(property, propertyJSR);
     callOperation<SlowOperation>(
         Address(stubInfoGPR, StructureStubInfo::offsetOfSlowOperation()),
-        dst,
         globalObjectGPR, stubInfoGPR, baseJSR, propertyJSR);
 #else
     VM& vm = this->vm();
@@ -1927,10 +1900,9 @@
     static_assert(std::is_same<decltype(operationHasPrivateBrandOptimize), decltype(operationGetPrivateNameOptimize)>::value);
     emitNakedNearCall(vm.getCTIStub(slow_op_get_private_name_prepareCallGenerator).retaggedCode<NoPtrTag>());
     emitNakedNearCall(vm.getCTIStub(checkExceptionGenerator).retaggedCode<NoPtrTag>());
-
-    emitPutVirtualRegister(dst, returnValueGPR);
 #endif // ENABLE(EXTRA_CTI_THUNKS)
 
+    static_assert(BaselineInByValRegisters::resultJSR == returnValueJSR);
     gen.reportSlowPathCall(coldPathBegin, Call());
 }
 
@@ -1945,7 +1917,7 @@
     linkAllSlowCases(iter);
 
     auto bytecode = currentInstruction->as<OpHasPrivateName>();
-    emitHasPrivateSlow(bytecode.m_dst, bytecode.m_base, bytecode.m_property, AccessType::HasPrivateName);
+    emitHasPrivateSlow(bytecode.m_base, bytecode.m_property, AccessType::HasPrivateName);
 }
 
 void JIT::emit_op_has_private_brand(const Instruction* currentInstruction)
@@ -1959,7 +1931,7 @@
     linkAllSlowCases(iter);
 
     auto bytecode = currentInstruction->as<OpHasPrivateBrand>();
-    emitHasPrivateSlow(bytecode.m_dst, bytecode.m_base, bytecode.m_brand, AccessType::HasPrivateBrand);
+    emitHasPrivateSlow(bytecode.m_base, bytecode.m_brand, AccessType::HasPrivateBrand);
 }
 
 void JIT::emit_op_resolve_scope(const Instruction* currentInstruction)
@@ -2923,6 +2895,7 @@
 
     doneCases.link(this);
 
+    setFastPathResumePoint();
     emitValueProfilingSite(bytecode, returnValueJSR);
     emitPutVirtualRegister(dst, returnValueJSR);
 }
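
Worth noting in the hunk above: the resume point is set after doneCases.link(this), so every fast-path done case and the slow-path call converge on a single profiling-and-store sequence, with the value in returnValueJSR on every incoming edge. Roughly:

    // Control flow after this hunk (sketch):
    //   inline fast cases --\
    //   doneCases -----------+--> [resume point] profile; store dst; next bytecode
    //   slow-path call -----/     (value in returnValueJSR on each edge)
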
@@ -3031,13 +3004,21 @@
         valueNotCell = branchIfNotCell(regT0);
     }
 
-    emitGetVirtualRegister(owner, jsRegT10);
+    constexpr GPRReg arg1GPR = preferredArgumentGPR<decltype(operationWriteBarrierSlowPath), 1>();
+#if USE(JSVALUE64)
+    constexpr JSValueRegs tmpJSR { arg1GPR };
+#elif USE(JSVALUE32_64)
+    constexpr JSValueRegs tmpJSR { regT0, arg1GPR };
+#endif
+    static_assert(noOverlap(regT0, arg1GPR, regT2));
+
+    emitGetVirtualRegister(owner, tmpJSR);
     Jump ownerNotCell;
     if (mode == ShouldFilterBase || mode == ShouldFilterBaseAndValue)
-        ownerNotCell = branchIfNotCell(jsRegT10);
+        ownerNotCell = branchIfNotCell(tmpJSR);
 
-    Jump ownerIsRememberedOrInEden = barrierBranch(vm(), jsRegT10.payloadGPR(), regT2);
-    callOperationNoExceptionCheck(operationWriteBarrierSlowPath, &vm(), jsRegT10.payloadGPR());
+    Jump ownerIsRememberedOrInEden = barrierBranch(vm(), tmpJSR.payloadGPR(), regT2);
+    callOperationNoExceptionCheck(operationWriteBarrierSlowPath, &vm(), tmpJSR.payloadGPR());
     ownerIsRememberedOrInEden.link(this);
 
     if (mode == ShouldFilterBase || mode == ShouldFilterBaseAndValue)
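
Finally, the write-barrier change: instead of loading the owner into the fixed jsRegT10 temporary and letting the call shuffle it into place, tmpJSR is built around the register in which operationWriteBarrierSlowPath receives its second argument (index 1), so the payload is loaded directly where the call needs it and the call site emits no extra move; the noOverlap static_assert guarantees the chosen register cannot collide with regT0 (holding the value) or regT2 (the barrier scratch). On a 64-bit target the effect is roughly this (illustrative sketch, not the shipped emission):

    // Sketch: tmpJSR.payloadGPR() == arg1GPR, so argument setup is free.
    emitGetVirtualRegister(owner, tmpJSR);                     // load straight into the argument register
    Jump ok = barrierBranch(vm(), tmpJSR.payloadGPR(), regT2); // regT2 used as scratch
    callOperationNoExceptionCheck(operationWriteBarrierSlowPath, &vm(), tmpJSR.payloadGPR());
    ok.link(this);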