Diff
Modified: trunk/Source/JavaScriptCore/ChangeLog (278575 => 278576)
--- trunk/Source/JavaScriptCore/ChangeLog 2021-06-07 22:11:29 UTC (rev 278575)
+++ trunk/Source/JavaScriptCore/ChangeLog 2021-06-07 22:51:25 UTC (rev 278576)
@@ -1,3 +1,97 @@
+2021-06-07 Mark Lam <[email protected]>
+
+ Put the Baseline JIT prologue and op_loop_hint code in JIT thunks.
+ https://bugs.webkit.org/show_bug.cgi?id=226375
+
+ Reviewed by Keith Miller and Robin Morisset.
+
+ Baseline JIT prologue code varies in behavior based on several variables:
+ (1) whether the prologue does any argument value profiling, (2) whether the
+ prologue is for a constructor, and (3) whether the compiled CodeBlock's frame
+ is so large that it exceeds the stack reserved zone (aka red zone), which
+ requires additional stack check logic.
+
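+ For illustration, here is a minimal sketch of how these specialization bits are
+ computed at Baseline compile time (it mirrors the logic this patch adds to
+ compileAndLinkWithoutFinalizing() in jit/JIT.cpp):
+
+     int frameTopOffset = stackPointerOffsetFor(m_codeBlock) * sizeof(Register);
+     unsigned maxFrameSize = -frameTopOffset;
+     bool doesProfiling = (m_codeBlock->codeType() == FunctionCode) && shouldEmitProfiling();
+     bool isConstructor = m_codeBlock->isConstructor();
+     bool hasHugeFrame = maxFrameSize > Options::reservedZoneSize();
+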
+ The pre-existing code generated specialized prologue code based on these (and
+ other) variables. In converting the prologue to use thunks, we opt not to turn
+ these specializations into runtime checks. Instead, the implementation uses 1 of
+ 8 possible specialized thunks, which reduces the need to pass arguments for
+ runtime checks. The only argument that needs to be passed to the prologue thunks
+ is the codeBlock pointer.
+
+ There are 8 possible thunks because we specialize based on 3 variables:
+ 1. doesProfiling
+ 2. isConstructor
+ 3. hasHugeFrame
+
+ 2**3 yields 8 permutations of prologue thunk specializations.
+
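+ A minimal sketch of the selection scheme (prologueThunkIndex below is just an
+ illustrative restatement of prologueGeneratorSelector(), which this patch adds
+ to jit/JIT.cpp; the packed bits index a table of 8 thunk generators):
+
+     static constexpr unsigned prologueThunkIndex(bool doesProfiling, bool isConstructor, bool hasHugeFrame)
+     {
+         return (doesProfiling << 2) | (isConstructor << 1) | (hasHugeFrame << 0);
+     }
+     // e.g. a profiling, non-constructor, huge-frame prologue selects thunk 5 of 8.
+     static_assert(prologueThunkIndex(true, false, true) == 5, "");
+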
+ There are also 8 analogous arity fixup prologue thunks that work the same way.
+
+ The op_loop_hint thunk only takes 1 runtime argument: the bytecode offset.
+
+ We tried doing the loop_hint optimization check in the thunk as well (in order
+ to move both the fast and slow paths into the thunk for maximum space savings).
+ However, this had a slight negative impact on benchmark performance. We ended up
+ keeping the fast path inline and having the slow path call a thunk to do its
+ work. This realizes the bulk of the size savings without the perf impact.
+
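+ A minimal sketch of the slow-path hand-off (it mirrors emitSlow_op_loop_hint()
+ in jit/JITOpcodes.cpp): the bytecode offset is the only runtime argument and
+ travels in a register:
+
+     constexpr GPRReg bytecodeOffsetGPR = regT7; // incoming argument of the thunk
+     move(TrustedImm32(m_bytecodeIndex.offset()), bytecodeOffsetGPR);
+     emitNakedNearCall(vm().getCTIStub(op_loop_hint_Generator).retaggedCode<NoPtrTag>());
+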
+ This patch also optimizes op_enter a bit more by eliminating the need to pass any
+ arguments to the thunk. The thunk previously took 2 arguments: localsToInit and
+ canBeOptimized. localsToInit is now computed in the thunk at runtime, and
+ canBeOptimized is used as a specialization argument to generate 2 variants of the
+ op_enter thunk: op_enter_canBeOptimized_Generator and op_enter_cannotBeOptimized_Generator,
+ thereby removing the need to pass it as a runtime argument.
+
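+ A minimal sketch of the runtime computation of localsToInit inside the thunk
+ (it mirrors op_enter_Generator() in jit/JITOpcodes.cpp):
+
+     loadPtr(addressFor(CallFrameSlot::codeBlock), codeBlockGPR);
+     load32(Address(codeBlockGPR, CodeBlock::offsetOfNumVars()), localsToInitGPR);
+     sub32(TrustedImm32(CodeBlock::llintBaselineCalleeSaveSpaceAsVirtualRegisters()), localsToInitGPR);
+     lshift32(TrustedImm32(virtualRegisterSizeShift), localsToInitGPR); // bytes, i.e. * sizeof(Register)
+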
+ LinkBuffer size results (from a single run of Speedometer2):
+
+ BaselineJIT: 93319628 (88.996532 MB) => 83851824 (79.967331 MB) 0.90x
+ ExtraCTIThunk: 5992 (5.851562 KB) => 6984 (6.820312 KB) 1.17x
+ ...
+ Total: 197530008 (188.379295 MB) => 188459444 (179.728931 MB) 0.95x
+
+ Speedometer2 and JetStream2 results (as measured on an M1 Mac) are neutral.
+
+ * assembler/AbstractMacroAssembler.h:
+ (JSC::AbstractMacroAssembler::untagReturnAddressWithoutExtraValidation):
+ * assembler/MacroAssemblerARM64E.h:
+ (JSC::MacroAssemblerARM64E::untagReturnAddress):
+ (JSC::MacroAssemblerARM64E::untagReturnAddressWithoutExtraValidation):
+ * assembler/MacroAssemblerARMv7.h:
+ (JSC::MacroAssemblerARMv7::branchAdd32):
+ * assembler/MacroAssemblerMIPS.h:
+ (JSC::MacroAssemblerMIPS::branchAdd32):
+ * bytecode/CodeBlock.h:
+ (JSC::CodeBlock::offsetOfNumCalleeLocals):
+ (JSC::CodeBlock::offsetOfNumVars):
+ (JSC::CodeBlock::offsetOfArgumentValueProfiles):
+ (JSC::CodeBlock::offsetOfShouldAlwaysBeInlined):
+ * jit/AssemblyHelpers.h:
+ (JSC::AssemblyHelpers::emitSaveCalleeSavesFor):
+ (JSC::AssemblyHelpers::emitSaveCalleeSavesForBaselineJIT):
+ (JSC::AssemblyHelpers::emitRestoreCalleeSavesForBaselineJIT):
+ * jit/JIT.cpp:
+ (JSC::JIT::compileAndLinkWithoutFinalizing):
+ (JSC::JIT::prologueGenerator):
+ (JSC::JIT::arityFixupPrologueGenerator):
+ (JSC::JIT::privateCompileExceptionHandlers):
+ * jit/JIT.h:
+ * jit/JITInlines.h:
+ (JSC::JIT::emitNakedNearCall):
+ * jit/JITOpcodes.cpp:
+ (JSC::JIT::op_ret_handlerGenerator):
+ (JSC::JIT::emit_op_enter):
+ (JSC::JIT::op_enter_Generator):
+ (JSC::JIT::op_enter_canBeOptimized_Generator):
+ (JSC::JIT::op_enter_cannotBeOptimized_Generator):
+ (JSC::JIT::emit_op_loop_hint):
+ (JSC::JIT::emitSlow_op_loop_hint):
+ (JSC::JIT::op_loop_hint_Generator):
+ (JSC::JIT::op_enter_handlerGenerator): Deleted.
+ * jit/JITOpcodes32_64.cpp:
+ (JSC::JIT::emit_op_enter):
+ * jit/ThunkGenerators.cpp:
+ (JSC::popThunkStackPreservesAndHandleExceptionGenerator):
+
2021-06-07 Robin Morisset <[email protected]>
Optimize compareStrictEq when neither side is a double and at least one is neither a string nor a BigInt
Modified: trunk/Source/JavaScriptCore/assembler/AbstractMacroAssembler.h (278575 => 278576)
--- trunk/Source/JavaScriptCore/assembler/AbstractMacroAssembler.h 2021-06-07 22:11:29 UTC (rev 278575)
+++ trunk/Source/JavaScriptCore/assembler/AbstractMacroAssembler.h 2021-06-07 22:51:25 UTC (rev 278576)
@@ -1003,6 +1003,7 @@
ALWAYS_INLINE void tagReturnAddress() { }
ALWAYS_INLINE void untagReturnAddress(RegisterID = RegisterID::InvalidGPRReg) { }
+ ALWAYS_INLINE void untagReturnAddressWithoutExtraValidation() { }
ALWAYS_INLINE void tagPtr(PtrTag, RegisterID) { }
ALWAYS_INLINE void tagPtr(RegisterID, RegisterID) { }
Modified: trunk/Source/JavaScriptCore/assembler/MacroAssemblerARM64E.h (278575 => 278576)
--- trunk/Source/JavaScriptCore/assembler/MacroAssemblerARM64E.h 2021-06-07 22:11:29 UTC (rev 278575)
+++ trunk/Source/JavaScriptCore/assembler/MacroAssemblerARM64E.h 2021-06-07 22:51:25 UTC (rev 278576)
@@ -54,10 +54,15 @@
ALWAYS_INLINE void untagReturnAddress(RegisterID scratch = InvalidGPR)
{
- untagPtr(ARM64Registers::sp, ARM64Registers::lr);
+ untagReturnAddressWithoutExtraValidation();
validateUntaggedPtr(ARM64Registers::lr, scratch);
}
+ ALWAYS_INLINE void untagReturnAddressWithoutExtraValidation()
+ {
+ untagPtr(ARM64Registers::sp, ARM64Registers::lr);
+ }
+
ALWAYS_INLINE void tagPtr(PtrTag tag, RegisterID target)
{
auto tagGPR = getCachedDataTempRegisterIDAndInvalidate();
Modified: trunk/Source/JavaScriptCore/assembler/MacroAssemblerARMv7.h (278575 => 278576)
--- trunk/Source/JavaScriptCore/assembler/MacroAssemblerARMv7.h 2021-06-07 22:11:29 UTC (rev 278575)
+++ trunk/Source/JavaScriptCore/assembler/MacroAssemblerARMv7.h 2021-06-07 22:51:25 UTC (rev 278576)
@@ -1795,6 +1795,23 @@
return branchAdd32(cond, dest, imm, dest);
}
+ Jump branchAdd32(ResultCondition cond, TrustedImm32 imm, Address dest)
+ {
+ load32(dest, dataTempRegister);
+
+ // Do the add.
+ ARMThumbImmediate armImm = ARMThumbImmediate::makeEncodedImm(imm.m_value);
+ if (armImm.isValid())
+ m_assembler.add_S(dataTempRegister, dataTempRegister, armImm);
+ else {
+ move(imm, addressTempRegister);
+ m_assembler.add_S(dataTempRegister, dataTempRegister, addressTempRegister);
+ }
+
+ store32(dataTempRegister, dest);
+ return Jump(makeBranch(cond));
+ }
+
Jump branchAdd32(ResultCondition cond, TrustedImm32 imm, AbsoluteAddress dest)
{
// Move the high bits of the address into addressTempRegister,
Modified: trunk/Source/JavaScriptCore/assembler/MacroAssemblerMIPS.h (278575 => 278576)
--- trunk/Source/JavaScriptCore/assembler/MacroAssemblerMIPS.h 2021-06-07 22:11:29 UTC (rev 278575)
+++ trunk/Source/JavaScriptCore/assembler/MacroAssemblerMIPS.h 2021-06-07 22:51:25 UTC (rev 278576)
@@ -2310,6 +2310,111 @@
return Jump();
}
+ Jump branchAdd32(ResultCondition cond, TrustedImm32 imm, ImplicitAddress destAddress)
+ {
+ bool useAddrTempRegister = !(destAddress.offset >= -32768 && destAddress.offset <= 32767
+ && !m_fixedWidth);
+
+ if (useAddrTempRegister) {
+ m_assembler.lui(addrTempRegister, (destAddress.offset + 0x8000) >> 16);
+ m_assembler.addu(addrTempRegister, addrTempRegister, destAddress.base);
+ }
+
+ auto loadDest = [&] (RegisterID dest) {
+ if (useAddrTempRegister)
+ m_assembler.lw(dest, addrTempRegister, destAddress.offset);
+ else
+ m_assembler.lw(dest, destAddress.base, destAddress.offset);
+ };
+
+ auto storeDest = [&] (RegisterID src) {
+ if (useAddrTempRegister)
+ m_assembler.sw(src, addrTempRegister, destAddress.offset);
+ else
+ m_assembler.sw(src, destAddress.base, destAddress.offset);
+ };
+
+ ASSERT((cond == Overflow) || (cond == Signed) || (cond == PositiveOrZero) || (cond == Zero) || (cond == NonZero));
+ if (cond == Overflow) {
+ if (m_fixedWidth) {
+ /*
+ load dest, dataTemp
+ move imm, immTemp
+ xor cmpTemp, dataTemp, immTemp
+ addu dataTemp, dataTemp, immTemp
+ store dataTemp, dest
+ bltz cmpTemp, No_overflow # diff sign bit -> no overflow
+ xor cmpTemp, dataTemp, immTemp
bgez cmpTemp, No_overflow # same sign bit -> no overflow
+ nop
+ b Overflow
+ nop
+ b No_overflow
+ nop
+ nop
+ nop
+ No_overflow:
+ */
+ loadDest(dataTempRegister);
+ move(imm, immTempRegister);
+ m_assembler.xorInsn(cmpTempRegister, dataTempRegister, immTempRegister);
+ m_assembler.addu(dataTempRegister, dataTempRegister, immTempRegister);
+ storeDest(dataTempRegister);
+ m_assembler.bltz(cmpTempRegister, 9);
+ m_assembler.xorInsn(cmpTempRegister, dataTempRegister, immTempRegister);
+ m_assembler.bgez(cmpTempRegister, 7);
+ m_assembler.nop();
+ } else {
+ loadDest(dataTempRegister);
+ if (imm.m_value >= 0 && imm.m_value <= 32767) {
+ move(dataTempRegister, cmpTempRegister);
+ m_assembler.addiu(dataTempRegister, dataTempRegister, imm.m_value);
+ m_assembler.bltz(cmpTempRegister, 9);
+ storeDest(dataTempRegister);
+ m_assembler.bgez(dataTempRegister, 7);
+ m_assembler.nop();
+ } else if (imm.m_value >= -32768 && imm.m_value < 0) {
+ move(dataTempRegister, cmpTempRegister);
+ m_assembler.addiu(dataTempRegister, dataTempRegister, imm.m_value);
+ m_assembler.bgez(cmpTempRegister, 9);
+ storeDest(dataTempRegister);
+ m_assembler.bltz(cmpTempRegister, 7);
+ m_assembler.nop();
+ } else {
+ move(imm, immTempRegister);
+ m_assembler.xorInsn(cmpTempRegister, dataTempRegister, immTempRegister);
+ m_assembler.addu(dataTempRegister, dataTempRegister, immTempRegister);
+ m_assembler.bltz(cmpTempRegister, 10);
+ storeDest(dataTempRegister);
+ m_assembler.xorInsn(cmpTempRegister, dataTempRegister, immTempRegister);
+ m_assembler.bgez(cmpTempRegister, 7);
+ m_assembler.nop();
+ }
+ }
+ return jump();
+ }
+ move(imm, immTempRegister);
+ loadDest(dataTempRegister);
+ add32(immTempRegister, dataTempRegister);
+ storeDest(dataTempRegister);
+ if (cond == Signed) {
+ // Check if dest is negative.
+ m_assembler.slt(cmpTempRegister, dataTempRegister, MIPSRegisters::zero);
+ return branchNotEqual(cmpTempRegister, MIPSRegisters::zero);
+ }
+ if (cond == PositiveOrZero) {
+ // Check if dest is not negative.
+ m_assembler.slt(cmpTempRegister, dataTempRegister, MIPSRegisters::zero);
+ return branchEqual(cmpTempRegister, MIPSRegisters::zero);
+ }
+ if (cond == Zero)
+ return branchEqual(dataTempRegister, MIPSRegisters::zero);
+ if (cond == NonZero)
+ return branchNotEqual(dataTempRegister, MIPSRegisters::zero);
+ ASSERT(0);
+ return Jump();
+ }
+
Jump branchMul32(ResultCondition cond, RegisterID src1, RegisterID src2, RegisterID dest)
{
ASSERT((cond == Overflow) || (cond == Signed) || (cond == Zero) || (cond == NonZero));
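These Address-based branchAdd32 overloads (added here for ARMv7 and MIPS) support
the updated Baseline op_loop_hint fast path, which now bumps the execution counter
through a CodeBlock base register rather than an AbsoluteAddress. A minimal usage
sketch, mirroring emit_op_loop_hint() in jit/JITOpcodes.cpp below:

    loadPtr(addressFor(CallFrameSlot::codeBlock), codeBlockGPR);
    addSlowCase(branchAdd32(PositiveOrZero, TrustedImm32(Options::executionCounterIncrementForLoop()),
        Address(codeBlockGPR, CodeBlock::offsetOfJITExecuteCounter())));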
Modified: trunk/Source/JavaScriptCore/bytecode/CodeBlock.h (278575 => 278576)
--- trunk/Source/JavaScriptCore/bytecode/CodeBlock.h 2021-06-07 22:11:29 UTC (rev 278575)
+++ trunk/Source/JavaScriptCore/bytecode/CodeBlock.h 2021-06-07 22:51:25 UTC (rev 278576)
@@ -169,7 +169,10 @@
unsigned numTmps() const { return m_unlinkedCode->hasCheckpoints() * maxNumCheckpointTmps; }
unsigned* addressOfNumParameters() { return &m_numParameters; }
+
+ static ptrdiff_t offsetOfNumCalleeLocals() { return OBJECT_OFFSETOF(CodeBlock, m_numCalleeLocals); }
static ptrdiff_t offsetOfNumParameters() { return OBJECT_OFFSETOF(CodeBlock, m_numParameters); }
+ static ptrdiff_t offsetOfNumVars() { return OBJECT_OFFSETOF(CodeBlock, m_numVars); }
CodeBlock* alternative() const { return static_cast<CodeBlock*>(m_alternative.get()); }
void setAlternative(VM&, CodeBlock*);
@@ -486,6 +489,8 @@
return result;
}
+ static ptrdiff_t offsetOfArgumentValueProfiles() { return OBJECT_OFFSETOF(CodeBlock, m_argumentValueProfiles); }
+
ValueProfile& valueProfileForBytecodeIndex(BytecodeIndex);
SpeculatedType valueProfilePredictionForBytecodeIndex(const ConcurrentJSLocker&, BytecodeIndex);
@@ -819,7 +824,7 @@
}
bool wasCompiledWithDebuggingOpcodes() const { return m_unlinkedCode->wasCompiledWithDebuggingOpcodes(); }
-
+
// This is intentionally public; it's the responsibility of anyone doing any
// of the following to hold the lock:
//
@@ -906,6 +911,7 @@
static ptrdiff_t offsetOfMetadataTable() { return OBJECT_OFFSETOF(CodeBlock, m_metadata); }
static ptrdiff_t offsetOfInstructionsRawPointer() { return OBJECT_OFFSETOF(CodeBlock, m_instructionsRawPointer); }
+ static ptrdiff_t offsetOfShouldAlwaysBeInlined() { return OBJECT_OFFSETOF(CodeBlock, m_shouldAlwaysBeInlined); }
bool loopHintsAreEligibleForFuzzingEarlyReturn()
{
Modified: trunk/Source/JavaScriptCore/jit/AssemblyHelpers.h (278575 => 278576)
--- trunk/Source/JavaScriptCore/jit/AssemblyHelpers.h 2021-06-07 22:11:29 UTC (rev 278575)
+++ trunk/Source/JavaScriptCore/jit/AssemblyHelpers.h 2021-06-07 22:51:25 UTC (rev 278576)
@@ -326,6 +326,11 @@
ASSERT(codeBlock);
const RegisterAtOffsetList* calleeSaves = codeBlock->calleeSaveRegisters();
+ emitSaveCalleeSavesFor(calleeSaves);
+ }
+
+ void emitSaveCalleeSavesFor(const RegisterAtOffsetList* calleeSaves)
+ {
RegisterSet dontSaveRegisters = RegisterSet(RegisterSet::stackRegisters(), RegisterSet::allFPRs());
unsigned registerCount = calleeSaves->size();
@@ -399,6 +404,11 @@
emitSaveCalleeSavesFor(codeBlock());
}
+ void emitSaveCalleeSavesForBaselineJIT()
+ {
+ emitSaveCalleeSavesFor(&RegisterAtOffsetList::llintBaselineCalleeSaveRegisters());
+ }
+
void emitSaveThenMaterializeTagRegisters()
{
#if USE(JSVALUE64)
@@ -417,6 +427,11 @@
emitRestoreCalleeSavesFor(codeBlock());
}
+ void emitRestoreCalleeSavesForBaselineJIT()
+ {
+ emitRestoreCalleeSavesFor(&RegisterAtOffsetList::llintBaselineCalleeSaveRegisters());
+ }
+
void emitRestoreSavedTagRegisters()
{
#if USE(JSVALUE64)
Modified: trunk/Source/JavaScriptCore/jit/JIT.cpp (278575 => 278576)
--- trunk/Source/JavaScriptCore/jit/JIT.cpp 2021-06-07 22:11:29 UTC (rev 278575)
+++ trunk/Source/JavaScriptCore/jit/JIT.cpp 2021-06-07 22:51:25 UTC (rev 278576)
@@ -55,6 +55,20 @@
static constexpr const bool verbose = false;
}
+#if ENABLE(EXTRA_CTI_THUNKS)
+#if CPU(ARM64) || (CPU(X86_64) && !OS(WINDOWS))
+// These are supported ports.
+#else
+// This is a courtesy reminder (and warning) that the implementation of EXTRA_CTI_THUNKS can
+// use up to 6 argument registers and/or 6/7 temp registers, and make use of ARM64-like
+// features. Hence, it may not work for many other ports without significant work. If you
+// plan on adding EXTRA_CTI_THUNKS support for your port, please remember to search the
+// EXTRA_CTI_THUNKS code for CPU(ARM64) and CPU(X86_64) conditional code, and add support
+// for your port there as well.
+#error "unsupported architecture"
+#endif
+#endif // ENABLE(EXTRA_CTI_THUNKS)
+
Seconds totalBaselineCompileTime;
Seconds totalDFGCompileTime;
Seconds totalFTLCompileTime;
@@ -83,7 +97,7 @@
{
}
-#if ENABLE(DFG_JIT)
+#if ENABLE(DFG_JIT) && !ENABLE(EXTRA_CTI_THUNKS)
void JIT::emitEnterOptimizationCheck()
{
if (!canBeOptimized())
@@ -101,7 +115,7 @@
farJump(returnValueGPR, GPRInfo::callFrameRegister);
skipOptimize.link(this);
}
-#endif
+#endif // ENABLE(DFG_JIT) && !ENABLE(EXTRA_CTI_THUNKS)
void JIT::emitNotifyWrite(WatchpointSet* set)
{
@@ -682,6 +696,32 @@
#endif
}
+static inline unsigned prologueGeneratorSelector(bool doesProfiling, bool isConstructor, bool hasHugeFrame)
+{
+ return doesProfiling << 2 | isConstructor << 1 | hasHugeFrame << 0;
+}
+
+#define FOR_EACH_NON_PROFILING_PROLOGUE_GENERATOR(v) \
+ v(!doesProfiling, !isConstructor, !hasHugeFrame, prologueGenerator0, arityFixup_prologueGenerator0) \
+ v(!doesProfiling, !isConstructor, hasHugeFrame, prologueGenerator1, arityFixup_prologueGenerator1) \
+ v(!doesProfiling, isConstructor, !hasHugeFrame, prologueGenerator2, arityFixup_prologueGenerator2) \
+ v(!doesProfiling, isConstructor, hasHugeFrame, prologueGenerator3, arityFixup_prologueGenerator3)
+
+#if ENABLE(DFG_JIT)
+#define FOR_EACH_PROFILING_PROLOGUE_GENERATOR(v) \
+ v( doesProfiling, !isConstructor, !hasHugeFrame, prologueGenerator4, arityFixup_prologueGenerator4) \
+ v( doesProfiling, !isConstructor, hasHugeFrame, prologueGenerator5, arityFixup_prologueGenerator5) \
+ v( doesProfiling, isConstructor, !hasHugeFrame, prologueGenerator6, arityFixup_prologueGenerator6) \
+ v( doesProfiling, isConstructor, hasHugeFrame, prologueGenerator7, arityFixup_prologueGenerator7)
+
+#else // not ENABLE(DFG_JIT)
+#define FOR_EACH_PROFILING_PROLOGUE_GENERATOR(v)
+#endif // ENABLE(DFG_JIT)
+
+#define FOR_EACH_PROLOGUE_GENERATOR(v) \
+ FOR_EACH_NON_PROFILING_PROLOGUE_GENERATOR(v) \
+ FOR_EACH_PROFILING_PROLOGUE_GENERATOR(v)
+
void JIT::compileAndLinkWithoutFinalizing(JITCompilationEffort effort)
{
DFG::CapabilityLevel level = m_codeBlock->capabilityLevel();
@@ -750,6 +790,8 @@
nop();
emitFunctionPrologue();
+
+#if !ENABLE(EXTRA_CTI_THUNKS)
emitPutToCallFrameHeader(m_codeBlock, CallFrameSlot::codeBlock);
Label beginLabel(this);
@@ -771,11 +813,10 @@
if (m_codeBlock->codeType() == FunctionCode) {
ASSERT(!m_bytecodeIndex);
if (shouldEmitProfiling()) {
- for (unsigned argument = 0; argument < m_codeBlock->numParameters(); ++argument) {
- // If this is a constructor, then we want to put in a dummy profiling site (to
- // keep things consistent) but we don't actually want to record the dummy value.
- if (m_codeBlock->isConstructor() && !argument)
- continue;
+ // If this is a constructor, then we want to put in a dummy profiling site (to
+ // keep things consistent) but we don't actually want to record the dummy value.
+ unsigned startArgument = m_codeBlock->isConstructor() ? 1 : 0;
+ for (unsigned argument = startArgument; argument < m_codeBlock->numParameters(); ++argument) {
int offset = CallFrame::argumentOffsetIncludingThis(argument) * static_cast<int>(sizeof(Register));
#if USE(JSVALUE64)
JSValueRegs resultRegs = JSValueRegs(regT0);
@@ -789,7 +830,34 @@
}
}
}
-
+#else // ENABLE(EXTRA_CTI_THUNKS)
+ constexpr GPRReg codeBlockGPR = regT7;
+ ASSERT(!m_bytecodeIndex);
+
+ int frameTopOffset = stackPointerOffsetFor(m_codeBlock) * sizeof(Register);
+ unsigned maxFrameSize = -frameTopOffset;
+
+ bool doesProfiling = (m_codeBlock->codeType() == FunctionCode) && shouldEmitProfiling();
+ bool isConstructor = m_codeBlock->isConstructor();
+ bool hasHugeFrame = maxFrameSize > Options::reservedZoneSize();
+
+ static constexpr ThunkGenerator generators[] = {
+#define USE_PROLOGUE_GENERATOR(doesProfiling, isConstructor, hasHugeFrame, name, arityFixupName) name,
+ FOR_EACH_PROLOGUE_GENERATOR(USE_PROLOGUE_GENERATOR)
+#undef USE_PROLOGUE_GENERATOR
+ };
+ static constexpr unsigned numberOfGenerators = sizeof(generators) / sizeof(generators[0]);
+
+ move(TrustedImmPtr(m_codeBlock), codeBlockGPR);
+
+ unsigned generatorSelector = prologueGeneratorSelector(doesProfiling, isConstructor, hasHugeFrame);
+ RELEASE_ASSERT(generatorSelector < numberOfGenerators);
+ auto generator = generators[generatorSelector];
+ emitNakedNearCall(vm().getCTIStub(generator).retaggedCode<NoPtrTag>());
+
+ Label bodyLabel(this);
+#endif // !ENABLE(EXTRA_CTI_THUNKS)
+
RELEASE_ASSERT(!JITCode::isJIT(m_codeBlock->jitType()));
if (UNLIKELY(sizeMarker))
@@ -803,16 +871,19 @@
m_disassembler->setEndOfSlowPath(label());
m_pcToCodeOriginMapBuilder.appendItem(label(), PCToCodeOriginMapBuilder::defaultCodeOrigin());
+#if !ENABLE(EXTRA_CTI_THUNKS)
stackOverflow.link(this);
m_bytecodeIndex = BytecodeIndex(0);
if (maxFrameExtentForSlowPathCall)
addPtr(TrustedImm32(-static_cast<int32_t>(maxFrameExtentForSlowPathCall)), stackPointerRegister);
callOperationWithCallFrameRollbackOnException(operationThrowStackOverflowError, m_codeBlock);
+#endif
// If the number of parameters is 1, we never require arity fixup.
bool requiresArityFixup = m_codeBlock->m_numParameters != 1;
if (m_codeBlock->codeType() == FunctionCode && requiresArityFixup) {
m_arityCheck = label();
+#if !ENABLE(EXTRA_CTI_THUNKS)
store8(TrustedImm32(0), &m_codeBlock->m_shouldAlwaysBeInlined);
emitFunctionPrologue();
emitPutToCallFrameHeader(m_codeBlock, CallFrameSlot::codeBlock);
@@ -831,17 +902,42 @@
move(returnValueGPR, GPRInfo::argumentGPR0);
emitNakedNearCall(m_vm->getCTIStub(arityFixupGenerator).retaggedCode<NoPtrTag>());
+ jump(beginLabel);
+
+#else // ENABLE(EXTRA_CTI_THUNKS)
+ emitFunctionPrologue();
+
+ static_assert(codeBlockGPR == regT7);
+ ASSERT(!m_bytecodeIndex);
+
+ static constexpr ThunkGenerator generators[] = {
+#define USE_PROLOGUE_GENERATOR(doesProfiling, isConstructor, hasHugeFrame, name, arityFixupName) arityFixupName,
+ FOR_EACH_PROLOGUE_GENERATOR(USE_PROLOGUE_GENERATOR)
+#undef USE_PROLOGUE_GENERATOR
+ };
+ static constexpr unsigned numberOfGenerators = sizeof(generators) / sizeof(generators[0]);
+
+ move(TrustedImmPtr(m_codeBlock), codeBlockGPR);
+
+ RELEASE_ASSERT(generatorSelector < numberOfGenerators);
+ auto generator = generators[generatorSelector];
+ RELEASE_ASSERT(generator);
+ emitNakedNearCall(vm().getCTIStub(generator).retaggedCode<NoPtrTag>());
+
+ jump(bodyLabel);
+#endif // !ENABLE(EXTRA_CTI_THUNKS)
+
#if ASSERT_ENABLED
m_bytecodeIndex = BytecodeIndex(); // Reset this, in order to guard its use with ASSERTs.
#endif
-
- jump(beginLabel);
} else
m_arityCheck = entryLabel; // Never require arity fixup.
ASSERT(m_jmpTable.isEmpty());
+#if !ENABLE(EXTRA_CTI_THUNKS)
privateCompileExceptionHandlers();
+#endif
if (m_disassembler)
m_disassembler->setEndOfCode(label());
@@ -851,6 +947,241 @@
link();
}
+#if ENABLE(EXTRA_CTI_THUNKS)
+MacroAssemblerCodeRef<JITThunkPtrTag> JIT::prologueGenerator(VM& vm, bool doesProfiling, bool isConstructor, bool hasHugeFrame, const char* thunkName)
+{
+ // This function generates the Baseline JIT's prologue code. It is not useable by other tiers.
+ constexpr GPRReg codeBlockGPR = regT7; // incoming.
+
+ constexpr int virtualRegisterSize = static_cast<int>(sizeof(Register));
+ constexpr int virtualRegisterSizeShift = 3;
+ static_assert((1 << virtualRegisterSizeShift) == virtualRegisterSize);
+
+ tagReturnAddress();
+
+ storePtr(codeBlockGPR, addressFor(CallFrameSlot::codeBlock));
+
+ load32(Address(codeBlockGPR, CodeBlock::offsetOfNumCalleeLocals()), regT1);
+ if constexpr (maxFrameExtentForSlowPathCallInRegisters)
+ add32(TrustedImm32(maxFrameExtentForSlowPathCallInRegisters), regT1);
+ lshift32(TrustedImm32(virtualRegisterSizeShift), regT1);
+ neg64(regT1);
+#if ASSERT_ENABLED
+ Probe::Function probeFunction = [] (Probe::Context& context) {
+ CodeBlock* codeBlock = context.fp<CallFrame*>()->codeBlock();
+ int64_t frameTopOffset = stackPointerOffsetFor(codeBlock) * sizeof(Register);
+ RELEASE_ASSERT(context.gpr<intptr_t>(regT1) == frameTopOffset);
+ };
+ probe(tagCFunctionPtr<JITProbePtrTag>(probeFunction), nullptr);
+#endif
+
+ addPtr(callFrameRegister, regT1);
+
+ JumpList stackOverflow;
+ if (hasHugeFrame)
+ stackOverflow.append(branchPtr(Above, regT1, callFrameRegister));
+ stackOverflow.append(branchPtr(Above, AbsoluteAddress(vm.addressOfSoftStackLimit()), regT1));
+
+ // We'll be imminently returning with a `retab` (ARM64E's return with authentication
+ // using the B key) in the normal path (see MacroAssemblerARM64E's implementation of
+ // ret()), which will do validation. So, extra validation here is redundant and unnecessary.
+ untagReturnAddressWithoutExtraValidation();
+#if CPU(X86_64)
+ pop(regT2); // Save the return address.
+#endif
+ move(regT1, stackPointerRegister);
+ tagReturnAddress();
+ checkStackPointerAlignment();
+#if CPU(X86_64)
+ push(regT2); // Restore the return address.
+#endif
+
+ emitSaveCalleeSavesForBaselineJIT();
+ emitMaterializeTagCheckRegisters();
+
+ if (doesProfiling) {
+ constexpr GPRReg argumentValueProfileGPR = regT6;
+ constexpr GPRReg numParametersGPR = regT5;
+ constexpr GPRReg argumentGPR = regT4;
+
+ load32(Address(codeBlockGPR, CodeBlock::offsetOfNumParameters()), numParametersGPR);
+ loadPtr(Address(codeBlockGPR, CodeBlock::offsetOfArgumentValueProfiles()), argumentValueProfileGPR);
+ if (isConstructor)
+ addPtr(TrustedImm32(sizeof(ValueProfile)), argumentValueProfileGPR);
+
+ int startArgument = CallFrameSlot::thisArgument + (isConstructor ? 1 : 0);
+ int startArgumentOffset = startArgument * virtualRegisterSize;
+ move(TrustedImm64(startArgumentOffset), argumentGPR);
+
+ add32(TrustedImm32(static_cast<int>(CallFrameSlot::thisArgument)), numParametersGPR);
+ lshift32(TrustedImm32(virtualRegisterSizeShift), numParametersGPR);
+
+ addPtr(callFrameRegister, argumentGPR);
+ addPtr(callFrameRegister, numParametersGPR);
+
+ Label loopStart(this);
+ Jump done = branchPtr(AboveOrEqual, argumentGPR, numParametersGPR);
+ {
+ load64(Address(argumentGPR), regT0);
+ store64(regT0, Address(argumentValueProfileGPR, OBJECT_OFFSETOF(ValueProfile, m_buckets)));
+
+ // The argument ValueProfiles are stored in a FixedVector. Hence, the
+ // address of the next profile can be trivially computed with an increment.
+ addPtr(TrustedImm32(sizeof(ValueProfile)), argumentValueProfileGPR);
+ addPtr(TrustedImm32(virtualRegisterSize), argumentGPR);
+ jump().linkTo(loopStart, this);
+ }
+ done.link(this);
+ }
+ ret();
+
+ stackOverflow.link(this);
+#if CPU(X86_64)
+ addPtr(TrustedImm32(1 * sizeof(CPURegister)), stackPointerRegister); // discard return address.
+#endif
+
+ uint32_t locationBits = CallSiteIndex(0).bits();
+ store32(TrustedImm32(locationBits), tagFor(CallFrameSlot::argumentCountIncludingThis));
+
+ if (maxFrameExtentForSlowPathCall)
+ addPtr(TrustedImm32(-static_cast<int32_t>(maxFrameExtentForSlowPathCall)), stackPointerRegister);
+
+ setupArguments<decltype(operationThrowStackOverflowError)>(codeBlockGPR);
+ prepareCallOperation(vm);
+ MacroAssembler::Call operationCall = call(OperationPtrTag);
+ Jump handleExceptionJump = jump();
+
+ auto handler = vm.getCTIStub(handleExceptionWithCallFrameRollbackGenerator);
+
+ LinkBuffer patchBuffer(*this, GLOBAL_THUNK_ID, LinkBuffer::Profile::ExtraCTIThunk);
+ patchBuffer.link(operationCall, FunctionPtr<OperationPtrTag>(operationThrowStackOverflowError));
+ patchBuffer.link(handleExceptionJump, CodeLocationLabel(handler.retaggedCode<NoPtrTag>()));
+ return FINALIZE_CODE(patchBuffer, JITThunkPtrTag, thunkName);
+}
+
+static constexpr bool doesProfiling = true;
+static constexpr bool isConstructor = true;
+static constexpr bool hasHugeFrame = true;
+
+#define DEFINE_PROLOGUE_GENERATOR(doesProfiling, isConstructor, hasHugeFrame, name, arityFixupName) \
+ MacroAssemblerCodeRef<JITThunkPtrTag> JIT::name(VM& vm) \
+ { \
+ JIT jit(vm); \
+ return jit.prologueGenerator(vm, doesProfiling, isConstructor, hasHugeFrame, "Baseline: " #name); \
+ }
+
+FOR_EACH_PROLOGUE_GENERATOR(DEFINE_PROLOGUE_GENERATOR)
+#undef DEFINE_PROLOGUE_GENERATOR
+
+MacroAssemblerCodeRef<JITThunkPtrTag> JIT::arityFixupPrologueGenerator(VM& vm, bool isConstructor, ThunkGenerator normalPrologueGenerator, const char* thunkName)
+{
+ // This function generates the Baseline JIT's arity fixup prologue code. It is not useable by other tiers.
+ constexpr GPRReg codeBlockGPR = regT7; // incoming.
+ constexpr GPRReg numParametersGPR = regT6;
+
+ tagReturnAddress();
+#if CPU(X86_64)
+ push(framePointerRegister);
+#elif CPU(ARM64)
+ pushPair(framePointerRegister, linkRegister);
+#endif
+
+ storePtr(codeBlockGPR, addressFor(CallFrameSlot::codeBlock));
+ store8(TrustedImm32(0), Address(codeBlockGPR, CodeBlock::offsetOfShouldAlwaysBeInlined()));
+
+ load32(payloadFor(CallFrameSlot::argumentCountIncludingThis), regT1);
+ load32(Address(codeBlockGPR, CodeBlock::offsetOfNumParameters()), numParametersGPR);
+ Jump noFixupNeeded = branch32(AboveOrEqual, regT1, numParametersGPR);
+
+ if constexpr (maxFrameExtentForSlowPathCall)
+ addPtr(TrustedImm32(-static_cast<int32_t>(maxFrameExtentForSlowPathCall)), stackPointerRegister);
+
+ loadPtr(Address(codeBlockGPR, CodeBlock::offsetOfGlobalObject()), argumentGPR0);
+
+ static_assert(std::is_same<decltype(operationConstructArityCheck), decltype(operationCallArityCheck)>::value);
+ setupArguments<decltype(operationCallArityCheck)>(argumentGPR0);
+ prepareCallOperation(vm);
+
+ MacroAssembler::Call arityCheckCall = call(OperationPtrTag);
+ Jump handleExceptionJump = emitNonPatchableExceptionCheck(vm);
+
+ if constexpr (maxFrameExtentForSlowPathCall)
+ addPtr(TrustedImm32(maxFrameExtentForSlowPathCall), stackPointerRegister);
+ Jump needFixup = branchTest32(NonZero, returnValueGPR);
+ noFixupNeeded.link(this);
+
+ // The normal prologue expects incoming codeBlockGPR.
+ load64(addressFor(CallFrameSlot::codeBlock), codeBlockGPR);
+
+#if CPU(X86_64)
+ pop(framePointerRegister);
+#elif CPU(ARM64)
+ popPair(framePointerRegister, linkRegister);
+#endif
+ untagReturnAddress();
+
+ JumpList normalPrologueJump;
+ normalPrologueJump.append(jump());
+
+ needFixup.link(this);
+
+ // Restore the stack for arity fixup, and preserve the return address.
+ // arityFixupGenerator will be shifting the stack. So, we can't use the stack to
+ // preserve the return address. We also can't use callee saved registers because
+ // they haven't been saved yet.
+ //
+ // arityFixupGenerator is carefully crafted to only use a0, a1, a2, t3, t4 and t5.
+ // So, the return address can be preserved in regT7.
+#if CPU(X86_64)
+ pop(argumentGPR2); // discard.
+ pop(regT7); // save return address.
+#elif CPU(ARM64)
+ popPair(framePointerRegister, linkRegister);
+ untagReturnAddress();
+ move(linkRegister, regT7);
+ auto randomReturnAddressTag = random();
+ move(TrustedImm32(randomReturnAddressTag), regT1);
+ tagPtr(regT1, regT7);
+#endif
+ move(returnValueGPR, GPRInfo::argumentGPR0);
+ Call arityFixupCall = nearCall();
+
+#if CPU(X86_64)
+ push(regT7); // restore return address.
+#elif CPU(ARM64)
+ move(TrustedImm32(randomReturnAddressTag), regT1);
+ untagPtr(regT1, regT7);
+ move(regT7, linkRegister);
+#endif
+
+ load64(addressFor(CallFrameSlot::codeBlock), codeBlockGPR);
+ normalPrologueJump.append(jump());
+
+ auto arityCheckOperation = isConstructor ? operationConstructArityCheck : operationCallArityCheck;
+ auto arityFixup = vm.getCTIStub(arityFixupGenerator);
+ auto normalPrologue = vm.getCTIStub(normalPrologueGenerator);
+ auto exceptionHandler = vm.getCTIStub(popThunkStackPreservesAndHandleExceptionGenerator);
+
+ LinkBuffer patchBuffer(*this, GLOBAL_THUNK_ID, LinkBuffer::Profile::ExtraCTIThunk);
+ patchBuffer.link(arityCheckCall, FunctionPtr<OperationPtrTag>(arityCheckOperation));
+ patchBuffer.link(arityFixupCall, FunctionPtr(arityFixup.retaggedCode<NoPtrTag>()));
+ patchBuffer.link(normalPrologueJump, CodeLocationLabel(normalPrologue.retaggedCode<NoPtrTag>()));
+ patchBuffer.link(handleExceptionJump, CodeLocationLabel(exceptionHandler.retaggedCode<NoPtrTag>()));
+ return FINALIZE_CODE(patchBuffer, JITThunkPtrTag, thunkName);
+}
+
+#define DEFINE_ARITY_PROLOGUE_GENERATOR(doesProfiling, isConstructor, hasHugeFrame, name, arityFixupName) \
+MacroAssemblerCodeRef<JITThunkPtrTag> JIT::arityFixupName(VM& vm) \
+ { \
+ JIT jit(vm); \
+ return jit.arityFixupPrologueGenerator(vm, isConstructor, name, "Baseline: " #arityFixupName); \
+ }
+
+FOR_EACH_PROLOGUE_GENERATOR(DEFINE_ARITY_PROLOGUE_GENERATOR)
+#undef DEFINE_ARITY_PROLOGUE_GENERATOR
+
+#endif // ENABLE(EXTRA_CTI_THUNKS)
+
void JIT::link()
{
LinkBuffer& patchBuffer = *m_linkBuffer;
@@ -1046,9 +1377,9 @@
return finalizeOnMainThread();
}
+#if !ENABLE(EXTRA_CTI_THUNKS)
void JIT::privateCompileExceptionHandlers()
{
-#if !ENABLE(EXTRA_CTI_THUNKS)
if (!m_exceptionChecksWithCallFrameRollback.empty()) {
m_exceptionChecksWithCallFrameRollback.link(this);
@@ -1073,8 +1404,8 @@
m_farCalls.append(FarCallRecord(call(OperationPtrTag), FunctionPtr<OperationPtrTag>(operationLookupExceptionHandler)));
jumpToExceptionHandler(vm());
}
-#endif // ENABLE(EXTRA_CTI_THUNKS)
}
+#endif // !ENABLE(EXTRA_CTI_THUNKS)
void JIT::doMainThreadPreparationBeforeCompile()
{
Modified: trunk/Source/JavaScriptCore/jit/JIT.h (278575 => 278576)
--- trunk/Source/JavaScriptCore/jit/JIT.h 2021-06-07 22:11:29 UTC (rev 278575)
+++ trunk/Source/JavaScriptCore/jit/JIT.h 2021-06-07 22:51:25 UTC (rev 278576)
@@ -318,7 +318,9 @@
m_exceptionChecksWithCallFrameRollback.append(emitExceptionCheck(vm()));
}
+#if !ENABLE(EXTRA_CTI_THUNKS)
void privateCompileExceptionHandlers();
+#endif
void advanceToNextCheckpoint();
void emitJumpSlowToHotForCheckpoint(Jump);
@@ -790,6 +792,26 @@
#if ENABLE(EXTRA_CTI_THUNKS)
// Thunk generators.
+ static MacroAssemblerCodeRef<JITThunkPtrTag> prologueGenerator0(VM&);
+ static MacroAssemblerCodeRef<JITThunkPtrTag> prologueGenerator1(VM&);
+ static MacroAssemblerCodeRef<JITThunkPtrTag> prologueGenerator2(VM&);
+ static MacroAssemblerCodeRef<JITThunkPtrTag> prologueGenerator3(VM&);
+ static MacroAssemblerCodeRef<JITThunkPtrTag> prologueGenerator4(VM&);
+ static MacroAssemblerCodeRef<JITThunkPtrTag> prologueGenerator5(VM&);
+ static MacroAssemblerCodeRef<JITThunkPtrTag> prologueGenerator6(VM&);
+ static MacroAssemblerCodeRef<JITThunkPtrTag> prologueGenerator7(VM&);
+ MacroAssemblerCodeRef<JITThunkPtrTag> prologueGenerator(VM&, bool doesProfiling, bool isConstructor, bool hasHugeFrame, const char* name);
+
+ static MacroAssemblerCodeRef<JITThunkPtrTag> arityFixup_prologueGenerator0(VM&);
+ static MacroAssemblerCodeRef<JITThunkPtrTag> arityFixup_prologueGenerator1(VM&);
+ static MacroAssemblerCodeRef<JITThunkPtrTag> arityFixup_prologueGenerator2(VM&);
+ static MacroAssemblerCodeRef<JITThunkPtrTag> arityFixup_prologueGenerator3(VM&);
+ static MacroAssemblerCodeRef<JITThunkPtrTag> arityFixup_prologueGenerator4(VM&);
+ static MacroAssemblerCodeRef<JITThunkPtrTag> arityFixup_prologueGenerator5(VM&);
+ static MacroAssemblerCodeRef<JITThunkPtrTag> arityFixup_prologueGenerator6(VM&);
+ static MacroAssemblerCodeRef<JITThunkPtrTag> arityFixup_prologueGenerator7(VM&);
+ MacroAssemblerCodeRef<JITThunkPtrTag> arityFixupPrologueGenerator(VM&, bool isConstructor, ThunkGenerator normalPrologueGenerator, const char* name);
+
static MacroAssemblerCodeRef<JITThunkPtrTag> slow_op_del_by_id_prepareCallGenerator(VM&);
static MacroAssemblerCodeRef<JITThunkPtrTag> slow_op_del_by_val_prepareCallGenerator(VM&);
static MacroAssemblerCodeRef<JITThunkPtrTag> slow_op_get_by_id_prepareCallGenerator(VM&);
@@ -804,7 +826,14 @@
static MacroAssemblerCodeRef<JITThunkPtrTag> slow_op_resolve_scopeGenerator(VM&);
static MacroAssemblerCodeRef<JITThunkPtrTag> op_check_traps_handlerGenerator(VM&);
- static MacroAssemblerCodeRef<JITThunkPtrTag> op_enter_handlerGenerator(VM&);
+
+ static MacroAssemblerCodeRef<JITThunkPtrTag> op_enter_canBeOptimized_Generator(VM&);
+ static MacroAssemblerCodeRef<JITThunkPtrTag> op_enter_cannotBeOptimized_Generator(VM&);
+ MacroAssemblerCodeRef<JITThunkPtrTag> op_enter_Generator(VM&, bool canBeOptimized, const char* thunkName);
+
+#if ENABLE(DFG_JIT)
+ static MacroAssemblerCodeRef<JITThunkPtrTag> op_loop_hint_Generator(VM&);
+#endif
static MacroAssemblerCodeRef<JITThunkPtrTag> op_ret_handlerGenerator(VM&);
static MacroAssemblerCodeRef<JITThunkPtrTag> op_throw_handlerGenerator(VM&);
Modified: trunk/Source/JavaScriptCore/jit/JITInlines.h (278575 => 278576)
--- trunk/Source/JavaScriptCore/jit/JITInlines.h 2021-06-07 22:11:29 UTC (rev 278575)
+++ trunk/Source/JavaScriptCore/jit/JITInlines.h 2021-06-07 22:51:25 UTC (rev 278576)
@@ -91,7 +91,6 @@
ALWAYS_INLINE JIT::Call JIT::emitNakedNearCall(CodePtr<NoPtrTag> target)
{
- ASSERT(m_bytecodeIndex); // This method should only be called during hot/cold path generation, so that m_bytecodeIndex is set.
Call nakedCall = nearCall();
m_nearCalls.append(NearCallRecord(nakedCall, FunctionPtr<JSInternalPtrTag>(target.retagged<JSInternalPtrTag>())));
return nakedCall;
Modified: trunk/Source/JavaScriptCore/jit/JITOpcodes.cpp (278575 => 278576)
--- trunk/Source/JavaScriptCore/jit/JITOpcodes.cpp 2021-06-07 22:11:29 UTC (rev 278575)
+++ trunk/Source/JavaScriptCore/jit/JITOpcodes.cpp 2021-06-07 22:51:25 UTC (rev 278576)
@@ -376,7 +376,7 @@
JIT jit(vm);
jit.checkStackPointerAlignment();
- jit.emitRestoreCalleeSavesFor(&RegisterAtOffsetList::llintBaselineCalleeSaveRegisters());
+ jit.emitRestoreCalleeSavesForBaselineJIT();
jit.emitFunctionEpilogue();
jit.ret();
@@ -1186,104 +1186,116 @@
emitEnterOptimizationCheck();
#else
ASSERT(m_bytecodeIndex.offset() == 0);
- constexpr GPRReg localsToInitGPR = argumentGPR0;
- constexpr GPRReg canBeOptimizedGPR = argumentGPR4;
-
unsigned localsToInit = count - CodeBlock::llintBaselineCalleeSaveSpaceAsVirtualRegisters();
RELEASE_ASSERT(localsToInit < count);
- move(TrustedImm32(localsToInit * sizeof(Register)), localsToInitGPR);
- move(TrustedImm32(canBeOptimized()), canBeOptimizedGPR);
- emitNakedNearCall(vm().getCTIStub(op_enter_handlerGenerator).retaggedCode<NoPtrTag>());
+ ThunkGenerator generator = canBeOptimized() ? op_enter_canBeOptimized_Generator : op_enter_cannotBeOptimized_Generator;
+ emitNakedNearCall(vm().getCTIStub(generator).retaggedCode<NoPtrTag>());
#endif // ENABLE(EXTRA_CTI_THUNKS)
}
#if ENABLE(EXTRA_CTI_THUNKS)
-MacroAssemblerCodeRef<JITThunkPtrTag> JIT::op_enter_handlerGenerator(VM& vm)
+MacroAssemblerCodeRef<JITThunkPtrTag> JIT::op_enter_Generator(VM& vm, bool canBeOptimized, const char* thunkName)
{
- JIT jit(vm);
-
#if CPU(X86_64)
- jit.push(X86Registers::ebp);
+ push(X86Registers::ebp);
#elif CPU(ARM64)
- jit.tagReturnAddress();
- jit.pushPair(framePointerRegister, linkRegister);
+ tagReturnAddress();
+ pushPair(framePointerRegister, linkRegister);
#endif
// op_enter is always at bytecodeOffset 0.
- jit.store32(TrustedImm32(0), tagFor(CallFrameSlot::argumentCountIncludingThis));
+ store32(TrustedImm32(0), tagFor(CallFrameSlot::argumentCountIncludingThis));
constexpr GPRReg localsToInitGPR = argumentGPR0;
constexpr GPRReg iteratorGPR = argumentGPR1;
constexpr GPRReg endGPR = argumentGPR2;
constexpr GPRReg undefinedGPR = argumentGPR3;
- constexpr GPRReg canBeOptimizedGPR = argumentGPR4;
+ constexpr GPRReg codeBlockGPR = argumentGPR4;
+ constexpr int virtualRegisterSizeShift = 3;
+ static_assert((1 << virtualRegisterSizeShift) == sizeof(Register));
+
+ loadPtr(addressFor(CallFrameSlot::codeBlock), codeBlockGPR);
+ load32(Address(codeBlockGPR, CodeBlock::offsetOfNumVars()), localsToInitGPR);
+ sub32(TrustedImm32(CodeBlock::llintBaselineCalleeSaveSpaceAsVirtualRegisters()), localsToInitGPR);
+ lshift32(TrustedImm32(virtualRegisterSizeShift), localsToInitGPR);
+
size_t startLocal = CodeBlock::llintBaselineCalleeSaveSpaceAsVirtualRegisters();
int startOffset = virtualRegisterForLocal(startLocal).offset();
- jit.move(TrustedImm64(startOffset * sizeof(Register)), iteratorGPR);
- jit.sub64(iteratorGPR, localsToInitGPR, endGPR);
+ move(TrustedImm64(startOffset * sizeof(Register)), iteratorGPR);
+ sub64(iteratorGPR, localsToInitGPR, endGPR);
- jit.move(TrustedImm64(JSValue::encode(jsUndefined())), undefinedGPR);
- auto initLoop = jit.label();
- Jump initDone = jit.branch32(LessThanOrEqual, iteratorGPR, endGPR);
+ move(TrustedImm64(JSValue::encode(jsUndefined())), undefinedGPR);
+ auto initLoop = label();
+ Jump initDone = branch32(LessThanOrEqual, iteratorGPR, endGPR);
{
- jit.store64(undefinedGPR, BaseIndex(GPRInfo::callFrameRegister, iteratorGPR, TimesOne));
- jit.sub64(TrustedImm32(sizeof(Register)), iteratorGPR);
- jit.jump(initLoop);
+ store64(undefinedGPR, BaseIndex(GPRInfo::callFrameRegister, iteratorGPR, TimesOne));
+ sub64(TrustedImm32(sizeof(Register)), iteratorGPR);
+ jump(initLoop);
}
- initDone.link(&jit);
+ initDone.link(this);
- // emitWriteBarrier(m_codeBlock).
- jit.loadPtr(addressFor(CallFrameSlot::codeBlock), argumentGPR1);
- Jump ownerIsRememberedOrInEden = jit.barrierBranch(vm, argumentGPR1, argumentGPR2);
+ // Implementing emitWriteBarrier(m_codeBlock).
+ Jump ownerIsRememberedOrInEden = barrierBranch(vm, codeBlockGPR, argumentGPR2);
- jit.move(canBeOptimizedGPR, GPRInfo::numberTagRegister); // save.
- jit.setupArguments<decltype(operationWriteBarrierSlowPath)>(&vm, argumentGPR1);
- jit.prepareCallOperation(vm);
- Call operationWriteBarrierCall = jit.call(OperationPtrTag);
+ setupArguments<decltype(operationWriteBarrierSlowPath)>(&vm, codeBlockGPR);
+ prepareCallOperation(vm);
+ Call operationWriteBarrierCall = call(OperationPtrTag);
- jit.move(GPRInfo::numberTagRegister, canBeOptimizedGPR); // restore.
- jit.move(TrustedImm64(JSValue::NumberTag), GPRInfo::numberTagRegister);
- ownerIsRememberedOrInEden.link(&jit);
+ if (canBeOptimized)
+ loadPtr(addressFor(CallFrameSlot::codeBlock), codeBlockGPR);
+ ownerIsRememberedOrInEden.link(this);
+
#if ENABLE(DFG_JIT)
+ // Implementing emitEnterOptimizationCheck().
Call operationOptimizeCall;
- if (Options::useDFGJIT()) {
- // emitEnterOptimizationCheck().
+ if (canBeOptimized) {
JumpList skipOptimize;
- skipOptimize.append(jit.branchTest32(Zero, canBeOptimizedGPR));
+ skipOptimize.append(branchAdd32(Signed, TrustedImm32(Options::executionCounterIncrementForEntry()), Address(codeBlockGPR, CodeBlock::offsetOfJITExecuteCounter())));
- jit.loadPtr(addressFor(CallFrameSlot::codeBlock), argumentGPR1);
- skipOptimize.append(jit.branchAdd32(Signed, TrustedImm32(Options::executionCounterIncrementForEntry()), Address(argumentGPR1, CodeBlock::offsetOfJITExecuteCounter())));
+ copyLLIntBaselineCalleeSavesFromFrameOrRegisterToEntryFrameCalleeSavesBuffer(vm.topEntryFrame);
- jit.copyLLIntBaselineCalleeSavesFromFrameOrRegisterToEntryFrameCalleeSavesBuffer(vm.topEntryFrame);
+ setupArguments<decltype(operationOptimize)>(&vm, TrustedImm32(0));
+ prepareCallOperation(vm);
+ operationOptimizeCall = call(OperationPtrTag);
- jit.setupArguments<decltype(operationOptimize)>(&vm, TrustedImm32(0));
- jit.prepareCallOperation(vm);
- operationOptimizeCall = jit.call(OperationPtrTag);
+ skipOptimize.append(branchTestPtr(Zero, returnValueGPR));
+ farJump(returnValueGPR, GPRInfo::callFrameRegister);
- skipOptimize.append(jit.branchTestPtr(Zero, returnValueGPR));
- jit.farJump(returnValueGPR, GPRInfo::callFrameRegister);
-
- skipOptimize.link(&jit);
+ skipOptimize.link(this);
}
#endif // ENABLE(DFG_JIT)
#if CPU(X86_64)
- jit.pop(X86Registers::ebp);
+ pop(X86Registers::ebp);
#elif CPU(ARM64)
- jit.popPair(framePointerRegister, linkRegister);
+ popPair(framePointerRegister, linkRegister);
#endif
- jit.ret();
+ ret();
- LinkBuffer patchBuffer(jit, GLOBAL_THUNK_ID, LinkBuffer::Profile::ExtraCTIThunk);
+ LinkBuffer patchBuffer(*this, GLOBAL_THUNK_ID, LinkBuffer::Profile::ExtraCTIThunk);
patchBuffer.link(operationWriteBarrierCall, FunctionPtr<OperationPtrTag>(operationWriteBarrierSlowPath));
#if ENABLE(DFG_JIT)
- if (Options::useDFGJIT())
+ if (canBeOptimized)
patchBuffer.link(operationOptimizeCall, FunctionPtr<OperationPtrTag>(operationOptimize));
#endif
- return FINALIZE_CODE(patchBuffer, JITThunkPtrTag, "Baseline: op_enter_handler");
+ return FINALIZE_CODE(patchBuffer, JITThunkPtrTag, thunkName);
}
+
+MacroAssemblerCodeRef<JITThunkPtrTag> JIT::op_enter_canBeOptimized_Generator(VM& vm)
+{
+ JIT jit(vm);
+ constexpr bool canBeOptimized = true;
+ return jit.op_enter_Generator(vm, canBeOptimized, "Baseline: op_enter_canBeOptimized");
+}
+
+MacroAssemblerCodeRef<JITThunkPtrTag> JIT::op_enter_cannotBeOptimized_Generator(VM& vm)
+{
+ JIT jit(vm);
+ constexpr bool canBeOptimized = false;
+ return jit.op_enter_Generator(vm, canBeOptimized, "Baseline: op_enter_cannotBeOptimized");
+}
#endif // ENABLE(EXTRA_CTI_THUNKS)
void JIT::emit_op_get_scope(const Instruction* currentInstruction)
@@ -1434,16 +1446,20 @@
add64(TrustedImm32(1), regT0);
store64(regT0, ptr);
}
+#else
+ UNUSED_PARAM(instruction);
#endif
- // Emit the JIT optimization check:
+ // Emit the JIT optimization check:
if (canBeOptimized()) {
+ constexpr GPRReg codeBlockGPR = regT0;
+ loadPtr(addressFor(CallFrameSlot::codeBlock), codeBlockGPR);
addSlowCase(branchAdd32(PositiveOrZero, TrustedImm32(Options::executionCounterIncrementForLoop()),
- AbsoluteAddress(m_codeBlock->addressOfJITExecuteCounter())));
+ Address(codeBlockGPR, CodeBlock::offsetOfJITExecuteCounter())));
}
}
-void JIT::emitSlow_op_loop_hint(const Instruction* currentInstruction, Vector<SlowCaseEntry>::iterator& iter)
+void JIT::emitSlow_op_loop_hint(const Instruction* instruction, Vector<SlowCaseEntry>::iterator& iter)
{
#if ENABLE(DFG_JIT)
// Emit the slow path for the JIT optimization check:
@@ -1450,6 +1466,7 @@
if (canBeOptimized()) {
linkAllSlowCases(iter);
+#if !ENABLE(EXTRA_CTI_THUNKS)
copyLLIntBaselineCalleeSavesFromFrameOrRegisterToEntryFrameCalleeSavesBuffer(vm().topEntryFrame);
callOperationNoExceptionCheck(operationOptimize, &vm(), m_bytecodeIndex.asBits());
@@ -1462,13 +1479,81 @@
farJump(returnValueGPR, GPRInfo::callFrameRegister);
noOptimizedEntry.link(this);
- emitJumpSlowToHot(jump(), currentInstruction->size());
+#else // ENABLE(EXTRA_CTI_THUNKS)
+ uint32_t bytecodeOffset = m_bytecodeIndex.offset();
+ ASSERT(BytecodeIndex(bytecodeOffset) == m_bytecodeIndex);
+ ASSERT(m_codeBlock->instructionAt(m_bytecodeIndex) == instruction);
+
+ constexpr GPRReg bytecodeOffsetGPR = regT7;
+
+ move(TrustedImm32(bytecodeOffset), bytecodeOffsetGPR);
+ emitNakedNearCall(vm().getCTIStub(op_loop_hint_Generator).retaggedCode<NoPtrTag>());
+#endif // !ENABLE(EXTRA_CTI_THUNKS)
}
-#else
- UNUSED_PARAM(currentInstruction);
+#endif // ENABLE(DFG_JIT)
UNUSED_PARAM(iter);
+ UNUSED_PARAM(instruction);
+}
+
+#if ENABLE(EXTRA_CTI_THUNKS)
+
+#if ENABLE(DFG_JIT)
+MacroAssemblerCodeRef<JITThunkPtrTag> JIT::op_loop_hint_Generator(VM& vm)
+{
+ // The thunk generated by this function can only work with the LLInt / Baseline JIT because
+ // it makes assumptions about the right globalObject being available from CallFrame::codeBlock().
+ // DFG/FTL may inline functions belonging to other globalObjects, which may not match
+ // CallFrame::codeBlock().
+ JIT jit(vm);
+
+ jit.tagReturnAddress();
+
+ constexpr GPRReg bytecodeOffsetGPR = regT7; // incoming.
+
+#if CPU(X86_64)
+ jit.push(framePointerRegister);
+#elif CPU(ARM64)
+ jit.pushPair(framePointerRegister, linkRegister);
#endif
+
+ auto usedRegisters = RegisterSet::stubUnavailableRegisters();
+ usedRegisters.add(bytecodeOffsetGPR);
+ jit.copyLLIntBaselineCalleeSavesFromFrameOrRegisterToEntryFrameCalleeSavesBuffer(vm.topEntryFrame, usedRegisters);
+
+ jit.store32(bytecodeOffsetGPR, CCallHelpers::tagFor(CallFrameSlot::argumentCountIncludingThis));
+ jit.lshift32(TrustedImm32(BytecodeIndex::checkpointShift), bytecodeOffsetGPR);
+ jit.setupArguments<decltype(operationOptimize)>(TrustedImmPtr(&vm), bytecodeOffsetGPR);
+ jit.prepareCallOperation(vm);
+ Call operationCall = jit.call(OperationPtrTag);
+ Jump hasOptimizedEntry = jit.branchTestPtr(NonZero, returnValueGPR);
+
+#if CPU(X86_64)
+ jit.pop(framePointerRegister);
+#elif CPU(ARM64)
+ jit.popPair(framePointerRegister, linkRegister);
+#endif
+ jit.ret();
+
+ hasOptimizedEntry.link(&jit);
+#if CPU(X86_64)
+ jit.addPtr(CCallHelpers::TrustedImm32(2 * sizeof(CPURegister)), stackPointerRegister);
+#elif CPU(ARM64)
+ jit.popPair(framePointerRegister, linkRegister);
+#endif
+ if (ASSERT_ENABLED) {
+ Jump ok = jit.branchPtr(MacroAssembler::Above, returnValueGPR, TrustedImmPtr(bitwise_cast<void*>(static_cast<intptr_t>(1000))));
+ jit.abortWithReason(JITUnreasonableLoopHintJumpTarget);
+ ok.link(&jit);
+ }
+
+ jit.farJump(returnValueGPR, GPRInfo::callFrameRegister);
+
+ LinkBuffer patchBuffer(jit, GLOBAL_THUNK_ID, LinkBuffer::Profile::ExtraCTIThunk);
+ patchBuffer.link(operationCall, FunctionPtr<OperationPtrTag>(operationOptimize));
+ return FINALIZE_CODE(patchBuffer, JITThunkPtrTag, "Baseline: op_loop_hint");
}
+#endif // ENABLE(DFG_JIT)
+#endif // ENABLE(EXTRA_CTI_THUNKS)
void JIT::emit_op_check_traps(const Instruction*)
{
Modified: trunk/Source/JavaScriptCore/jit/JITOpcodes32_64.cpp (278575 => 278576)
--- trunk/Source/JavaScriptCore/jit/JITOpcodes32_64.cpp 2021-06-07 22:11:29 UTC (rev 278575)
+++ trunk/Source/JavaScriptCore/jit/JITOpcodes32_64.cpp 2021-06-07 22:51:25 UTC (rev 278576)
@@ -1066,7 +1066,7 @@
// Even though JIT code doesn't use them, we initialize our constant
// registers to zap stale pointers, to avoid unnecessarily prolonging
// object lifetime and increasing GC pressure.
- for (int i = CodeBlock::llintBaselineCalleeSaveSpaceAsVirtualRegisters(); i < m_codeBlock->numVars(); ++i)
+ for (unsigned i = CodeBlock::llintBaselineCalleeSaveSpaceAsVirtualRegisters(); i < m_codeBlock->numVars(); ++i)
emitStore(virtualRegisterForLocal(i), jsUndefined());
JITSlowPathCall slowPathCall(this, currentInstruction, slow_path_enter);
Modified: trunk/Source/JavaScriptCore/jit/ThunkGenerators.cpp (278575 => 278576)
--- trunk/Source/JavaScriptCore/jit/ThunkGenerators.cpp 2021-06-07 22:11:29 UTC (rev 278575)
+++ trunk/Source/JavaScriptCore/jit/ThunkGenerators.cpp 2021-06-07 22:51:25 UTC (rev 278576)
@@ -81,11 +81,7 @@
{
CCallHelpers jit;
-#if CPU(X86_64)
- jit.addPtr(CCallHelpers::TrustedImm32(2 * sizeof(CPURegister)), X86Registers::esp);
-#elif CPU(ARM64)
- jit.popPair(CCallHelpers::framePointerRegister, CCallHelpers::linkRegister);
-#endif
+ jit.addPtr(CCallHelpers::TrustedImm32(2 * sizeof(CPURegister)), CCallHelpers::stackPointerRegister);
CCallHelpers::Jump continuation = jit.jump();