- Revision
- 203365
- Author
- [email protected]
- Date
- 2016-07-18 12:51:45 -0700 (Mon, 18 Jul 2016)
Log Message
RegisterSet should use a Bitmap instead of a BitVector so that it never allocates memory and is trivial to copy
https://bugs.webkit.org/show_bug.cgi?id=159863
Reviewed by Saam Barati.
Source/_javascript_Core:
Switch RegisterSet set to Bitmap because Bitmap doesn't ever allocate memory and can be
assigned by memcpy. This should be a performance improvement for compiler code that does a
lot of things with RegisterSet. For example, it's one of the fundamental data structures in
Air. The previous use of BitVector meant that almost every operation on RegisterSet would
have a slow path call. On ARM64, it would mean memory allocation for any RegisterSet that
used all available registers.
This meant adding even more GPR/FPR reflection to the MacroAssembler API: we now have consts
called numGPRs and numFPRs. This is necessary to statically size the Bitmap in RegisterSet.
Here's the breakdown of sizes of RegisterSet on different CPUs:
x86-32: 8 bits (GPRs) + 8 bits (FPRs) + 1 bit (is deleted) = 1x uint32_t.
x86-64: 16 bits + 16 bits + 1 bit = 2x uint32_t.
ARMv7: 16 bits + 16 bits + 1 bit = 2x uint32_t.
ARM64: 32 bits + 32 bits + 1 bit = 3x uint32_t.
* assembler/MacroAssemblerARM.h:
* assembler/MacroAssemblerARM64.h:
* assembler/MacroAssemblerARMv7.h:
* assembler/MacroAssemblerX86.h:
* assembler/MacroAssemblerX86Common.h:
(JSC::MacroAssemblerX86Common::scratchRegister):
* assembler/MacroAssemblerX86_64.h:
* jit/RegisterSet.h:
(JSC::RegisterSet::set):
(JSC::RegisterSet::get):
(JSC::RegisterSet::setAll):
(JSC::RegisterSet::merge):
(JSC::RegisterSet::filter):
(JSC::RegisterSet::exclude):
(JSC::RegisterSet::numberOfSetRegisters):
(JSC::RegisterSet::RegisterSet):
(JSC::RegisterSet::isEmptyValue):
(JSC::RegisterSet::isDeletedValue):
(JSC::RegisterSet::operator==):
(JSC::RegisterSet::operator!=):
(JSC::RegisterSet::hash):
(JSC::RegisterSet::forEach):
(JSC::RegisterSet::setMany):
Source/WTF:
Give Bitmap all of the power of BitVector (except for automatic resizing). This means a
variant of set() that takes a bool, and a bunch of helper methods (merge, filter, exclude,
forEachSetBit, ==, !=, and hash).
* wtf/Bitmap.h:
(WTF::WordType>::set):
(WTF::WordType>::testAndSet):
(WTF::WordType>::isFull):
(WTF::WordType>::merge):
(WTF::WordType>::filter):
(WTF::WordType>::exclude):
(WTF::WordType>::forEachSetBit):
(WTF::=):
(WTF::WordType>::hash):
Modified Paths
Diff
Modified: trunk/Source/_javascript_Core/ChangeLog (203364 => 203365)
--- trunk/Source/_javascript_Core/ChangeLog 2016-07-18 19:32:34 UTC (rev 203364)
+++ trunk/Source/_javascript_Core/ChangeLog 2016-07-18 19:51:45 UTC (rev 203365)
@@ -1,3 +1,51 @@
+2016-07-17 Filip Pizlo <[email protected]>
+
+ RegisterSet should use a Bitmap instead of a BitVector so that it never allocates memory and is trivial to copy
+ https://bugs.webkit.org/show_bug.cgi?id=159863
+
+ Reviewed by Saam Barati.
+
+ Switch RegisterSet set to Bitmap because Bitmap doesn't ever allocate memory and can be
+ assigned by memcpy. This should be a performance improvement for compiler code that does a
+ lot of things with RegisterSet. For example, it's one of the fundamental data structures in
+ Air. The previous use of BitVector meant that almost every operation on RegisterSet would
+ have a slow path call. On ARM64, it would mean memory allocation for any RegisterSet that
+ used all available registers.
+
+ This meant adding even more GPR/FPR reflection to the MacroAssembler API: we now have consts
+ called numGPRs and numFPRs. This is necessary to statically size the Bitmap in RegisterSet.
+
+ Here's the breakdown of sizes of RegisterSet on different CPUs:
+
+ x86-32: 8 bits (GPRs) + 8 bits (FPRs) + 1 bit (is deleted) = 1x uint32_t.
+ x86-64: 16 bits + 16 bits + 1 bit = 2x uint32_t.
+ ARMv7: 16 bits + 16 bits + 1 bit = 2x uint32_t.
+ ARM64: 32 bits + 32 bits + 1 bit = 3x uint32_t.
+
+ * assembler/MacroAssemblerARM.h:
+ * assembler/MacroAssemblerARM64.h:
+ * assembler/MacroAssemblerARMv7.h:
+ * assembler/MacroAssemblerX86.h:
+ * assembler/MacroAssemblerX86Common.h:
+ (JSC::MacroAssemblerX86Common::scratchRegister):
+ * assembler/MacroAssemblerX86_64.h:
+ * jit/RegisterSet.h:
+ (JSC::RegisterSet::set):
+ (JSC::RegisterSet::get):
+ (JSC::RegisterSet::setAll):
+ (JSC::RegisterSet::merge):
+ (JSC::RegisterSet::filter):
+ (JSC::RegisterSet::exclude):
+ (JSC::RegisterSet::numberOfSetRegisters):
+ (JSC::RegisterSet::RegisterSet):
+ (JSC::RegisterSet::isEmptyValue):
+ (JSC::RegisterSet::isDeletedValue):
+ (JSC::RegisterSet::operator==):
+ (JSC::RegisterSet::operator!=):
+ (JSC::RegisterSet::hash):
+ (JSC::RegisterSet::forEach):
+ (JSC::RegisterSet::setMany):
+
2016-07-15 Filip Pizlo <[email protected]>
DFG and FTL should support op_call_eval
Modified: trunk/Source/_javascript_Core/assembler/MacroAssemblerARM.h (203364 => 203365)
--- trunk/Source/_javascript_Core/assembler/MacroAssemblerARM.h 2016-07-18 19:32:34 UTC (rev 203364)
+++ trunk/Source/_javascript_Core/assembler/MacroAssemblerARM.h 2016-07-18 19:51:45 UTC (rev 203365)
@@ -40,6 +40,9 @@
static const int DoubleConditionBitSpecial = 0x10;
COMPILE_ASSERT(!(DoubleConditionBitSpecial & DoubleConditionMask), DoubleConditionBitSpecial_should_not_interfere_with_ARMAssembler_Condition_codes);
public:
+ static const unsigned numGPRs = 16;
+ static const unsigned numFPRs = 16;
+
typedef ARMRegisters::FPRegisterID FPRegisterID;
enum RelationalCondition {
Modified: trunk/Source/_javascript_Core/assembler/MacroAssemblerARM64.h (203364 => 203365)
--- trunk/Source/_javascript_Core/assembler/MacroAssemblerARM64.h 2016-07-18 19:32:34 UTC (rev 203364)
+++ trunk/Source/_javascript_Core/assembler/MacroAssemblerARM64.h 2016-07-18 19:51:45 UTC (rev 203365)
@@ -37,6 +37,9 @@
class MacroAssemblerARM64 : public AbstractMacroAssembler<ARM64Assembler, MacroAssemblerARM64> {
public:
+ static const unsigned numGPRs = 32;
+ static const unsigned numFPRs = 32;
+
static const RegisterID dataTempRegister = ARM64Registers::ip0;
static const RegisterID memoryTempRegister = ARM64Registers::ip1;
Modified: trunk/Source/_javascript_Core/assembler/MacroAssemblerARMv7.h (203364 => 203365)
--- trunk/Source/_javascript_Core/assembler/MacroAssemblerARMv7.h 2016-07-18 19:32:34 UTC (rev 203364)
+++ trunk/Source/_javascript_Core/assembler/MacroAssemblerARMv7.h 2016-07-18 19:51:45 UTC (rev 203365)
@@ -42,6 +42,9 @@
inline ARMRegisters::FPSingleRegisterID fpTempRegisterAsSingle() { return ARMRegisters::asSingle(fpTempRegister); }
public:
+ static const unsigned numGPRs = 16;
+ static const unsigned numFPRs = 16;
+
MacroAssemblerARMv7()
: m_makeJumpPatchable(false)
{
Modified: trunk/Source/_javascript_Core/assembler/MacroAssemblerX86.h (203364 => 203365)
--- trunk/Source/_javascript_Core/assembler/MacroAssemblerX86.h 2016-07-18 19:32:34 UTC (rev 203364)
+++ trunk/Source/_javascript_Core/assembler/MacroAssemblerX86.h 2016-07-18 19:51:45 UTC (rev 203365)
@@ -34,6 +34,9 @@
class MacroAssemblerX86 : public MacroAssemblerX86Common {
public:
+ static const unsigned numGPRs = 8;
+ static const unsigned numFPRs = 8;
+
static const Scale ScalePtr = TimesFour;
using MacroAssemblerX86Common::add32;
Modified: trunk/Source/_javascript_Core/assembler/MacroAssemblerX86Common.h (203364 => 203365)
--- trunk/Source/_javascript_Core/assembler/MacroAssemblerX86Common.h 2016-07-18 19:32:34 UTC (rev 203364)
+++ trunk/Source/_javascript_Core/assembler/MacroAssemblerX86Common.h 2016-07-18 19:51:45 UTC (rev 203365)
@@ -48,7 +48,7 @@
return s_scratchRegister;
}
#endif
-
+
protected:
static const int DoubleConditionBitInvert = 0x10;
static const int DoubleConditionBitSpecial = 0x20;
Modified: trunk/Source/_javascript_Core/assembler/MacroAssemblerX86_64.h (203364 => 203365)
--- trunk/Source/_javascript_Core/assembler/MacroAssemblerX86_64.h 2016-07-18 19:32:34 UTC (rev 203364)
+++ trunk/Source/_javascript_Core/assembler/MacroAssemblerX86_64.h 2016-07-18 19:51:45 UTC (rev 203365)
@@ -38,6 +38,9 @@
class MacroAssemblerX86_64 : public MacroAssemblerX86Common {
public:
+ static const unsigned numGPRs = 16;
+ static const unsigned numFPRs = 16;
+
static const Scale ScalePtr = TimesEight;
using MacroAssemblerX86Common::add32;
Modified: trunk/Source/_javascript_Core/jit/RegisterSet.h (203364 => 203365)
--- trunk/Source/_javascript_Core/jit/RegisterSet.h 2016-07-18 19:32:34 UTC (rev 203364)
+++ trunk/Source/_javascript_Core/jit/RegisterSet.h 2016-07-18 19:51:45 UTC (rev 203365)
@@ -33,7 +33,7 @@
#include "MacroAssembler.h"
#include "Reg.h"
#include "TempRegisterSet.h"
-#include <wtf/BitVector.h>
+#include <wtf/Bitmap.h>
namespace JSC {
@@ -71,7 +71,7 @@
void set(Reg reg, bool value = true)
{
ASSERT(!!reg);
- m_vector.set(reg.index(), value);
+ m_bits.set(reg.index(), value);
}
void set(JSValueRegs regs, bool value = true)
@@ -90,9 +90,9 @@
bool get(Reg reg) const
{
ASSERT(!!reg);
- return m_vector.get(reg.index());
+ return m_bits.get(reg.index());
}
-
+
template<typename Iterable>
void setAll(const Iterable& iterable)
{
@@ -100,13 +100,13 @@
set(reg);
}
- void merge(const RegisterSet& other) { m_vector.merge(other.m_vector); }
- void filter(const RegisterSet& other) { m_vector.filter(other.m_vector); }
- void exclude(const RegisterSet& other) { m_vector.exclude(other.m_vector); }
+ void merge(const RegisterSet& other) { m_bits.merge(other.m_bits); }
+ void filter(const RegisterSet& other) { m_bits.filter(other.m_bits); }
+ void exclude(const RegisterSet& other) { m_bits.exclude(other.m_bits); }
size_t numberOfSetGPRs() const;
size_t numberOfSetFPRs() const;
- size_t numberOfSetRegisters() const { return m_vector.bitCount(); }
+ size_t numberOfSetRegisters() const { return m_bits.count(); }
void dump(PrintStream&) const;
@@ -114,28 +114,40 @@
enum DeletedValueTag { DeletedValue };
RegisterSet(EmptyValueTag)
- : m_vector(BitVector::EmptyValue)
{
+ m_bits.set(hashSpecialBitIndex);
}
RegisterSet(DeletedValueTag)
- : m_vector(BitVector::DeletedValue)
{
+ m_bits.set(hashSpecialBitIndex);
+ m_bits.set(deletedBitIndex);
}
- bool isEmptyValue() const { return m_vector.isEmptyValue(); }
- bool isDeletedValue() const { return m_vector.isDeletedValue(); }
+ bool isEmptyValue() const
+ {
+ return m_bits.get(hashSpecialBitIndex) && !m_bits.get(deletedBitIndex);
+ }
- bool operator==(const RegisterSet& other) const { return m_vector == other.m_vector; }
- unsigned hash() const { return m_vector.hash(); }
-
- template<typename Functor>
- void forEach(const Functor& functor) const
+ bool isDeletedValue() const
{
- for (size_t index : m_vector)
- functor(Reg::fromIndex(index));
+ return m_bits.get(hashSpecialBitIndex) && m_bits.get(deletedBitIndex);
}
+ bool operator==(const RegisterSet& other) const { return m_bits == other.m_bits; }
+ bool operator!=(const RegisterSet& other) const { return m_bits != other.m_bits; }
+
+ unsigned hash() const { return m_bits.hash(); }
+
+ template<typename Func>
+ void forEach(const Func& func) const
+ {
+ m_bits.forEachSetBit(
+ [&] (size_t index) {
+ func(Reg::fromIndex(index));
+ });
+ }
+
private:
void setAny(Reg reg) { set(reg); }
void setAny(const RegisterSet& set) { merge(set); }
@@ -147,7 +159,13 @@
setMany(regs...);
}
- BitVector m_vector;
+ // These offsets mirror the logic in Reg.h.
+ static const unsigned gprOffset = 0;
+ static const unsigned fprOffset = gprOffset + MacroAssembler::numGPRs;
+ static const unsigned hashSpecialBitIndex = fprOffset + MacroAssembler::numFPRs;
+ static const unsigned deletedBitIndex = 0;
+
+ Bitmap<MacroAssembler::numGPRs + MacroAssembler::numFPRs + 1> m_bits;
};
struct RegisterSetHash {
Modified: trunk/Source/WTF/ChangeLog (203364 => 203365)
--- trunk/Source/WTF/ChangeLog 2016-07-18 19:32:34 UTC (rev 203364)
+++ trunk/Source/WTF/ChangeLog 2016-07-18 19:51:45 UTC (rev 203365)
@@ -1,3 +1,25 @@
+2016-07-17 Filip Pizlo <[email protected]>
+
+ RegisterSet should use a Bitmap instead of a BitVector so that it never allocates memory and is trivial to copy
+ https://bugs.webkit.org/show_bug.cgi?id=159863
+
+ Reviewed by Saam Barati.
+
+ Give Bitmap all of the power of BitVector (except for automatic resizing). This means a
+ variant of set() that takes a bool, and a bunch of helper methods (merge, filter, exclude,
+ forEachSetBit, ==, !=, and hash).
+
+ * wtf/Bitmap.h:
+ (WTF::WordType>::set):
+ (WTF::WordType>::testAndSet):
+ (WTF::WordType>::isFull):
+ (WTF::WordType>::merge):
+ (WTF::WordType>::filter):
+ (WTF::WordType>::exclude):
+ (WTF::WordType>::forEachSetBit):
+ (WTF::=):
+ (WTF::WordType>::hash):
+
2016-07-02 Filip Pizlo <[email protected]>
WTF::Lock should be fair eventually
Modified: trunk/Source/WTF/wtf/Bitmap.h (203364 => 203365)
--- trunk/Source/WTF/wtf/Bitmap.h 2016-07-18 19:32:34 UTC (rev 203364)
+++ trunk/Source/WTF/wtf/Bitmap.h 2016-07-18 19:51:45 UTC (rev 203365)
@@ -1,5 +1,5 @@
/*
- * Copyright (C) 2010 Apple Inc. All rights reserved.
+ * Copyright (C) 2010, 2016 Apple Inc. All rights reserved.
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
@@ -39,6 +39,8 @@
template<size_t bitmapSize, BitmapAtomicMode atomicMode = BitmapNotAtomic, typename WordType = uint32_t>
class Bitmap {
WTF_MAKE_FAST_ALLOCATED;
+
+ static_assert(sizeof(WordType) <= sizeof(unsigned), "WordType must not be bigger than unsigned");
public:
Bitmap();
@@ -49,6 +51,7 @@
bool get(size_t) const;
void set(size_t);
+ void set(size_t, bool);
bool testAndSet(size_t);
bool testAndClear(size_t);
bool concurrentTestAndSet(size_t);
@@ -60,6 +63,18 @@
size_t count(size_t = 0) const;
size_t isEmpty() const;
size_t isFull() const;
+
+ void merge(const Bitmap&);
+ void filter(const Bitmap&);
+ void exclude(const Bitmap&);
+
+ template<typename Func>
+ void forEachSetBit(const Func&) const;
+
+ bool operator==(const Bitmap&) const;
+ bool operator!=(const Bitmap&) const;
+
+ unsigned hash() const;
private:
static const unsigned wordSize = sizeof(WordType) * 8;
@@ -94,6 +109,15 @@
}
template<size_t bitmapSize, BitmapAtomicMode atomicMode, typename WordType>
+inline void Bitmap<bitmapSize, atomicMode, WordType>::set(size_t n, bool value)
+{
+ if (value)
+ set(n);
+ else
+ clear(n);
+}
+
+template<size_t bitmapSize, BitmapAtomicMode atomicMode, typename WordType>
inline bool Bitmap<bitmapSize, atomicMode, WordType>::testAndSet(size_t n)
{
WordType mask = one << (n % wordSize);
@@ -224,5 +248,71 @@
return true;
}
+template<size_t bitmapSize, BitmapAtomicMode atomicMode, typename WordType>
+inline void Bitmap<bitmapSize, atomicMode, WordType>::merge(const Bitmap& other)
+{
+ for (size_t i = 0; i < words; ++i)
+ bits[i] |= other.bits[i];
}
+
+template<size_t bitmapSize, BitmapAtomicMode atomicMode, typename WordType>
+inline void Bitmap<bitmapSize, atomicMode, WordType>::filter(const Bitmap& other)
+{
+ for (size_t i = 0; i < words; ++i)
+ bits[i] &= other.bits[i];
+}
+
+template<size_t bitmapSize, BitmapAtomicMode atomicMode, typename WordType>
+inline void Bitmap<bitmapSize, atomicMode, WordType>::exclude(const Bitmap& other)
+{
+ for (size_t i = 0; i < words; ++i)
+ bits[i] &= ~other.bits[i];
+}
+
+template<size_t bitmapSize, BitmapAtomicMode atomicMode, typename WordType>
+template<typename Func>
+inline void Bitmap<bitmapSize, atomicMode, WordType>::forEachSetBit(const Func& func) const
+{
+ for (size_t i = 0; i < words; ++i) {
+ WordType word = bits[i];
+ if (!word)
+ continue;
+ size_t base = i * wordSize;
+ for (size_t j = 0; j < wordSize; ++j) {
+ if (word & 1)
+ func(base + j);
+ word >>= 1;
+ }
+ }
+}
+
+template<size_t bitmapSize, BitmapAtomicMode atomicMode, typename WordType>
+inline bool Bitmap<bitmapSize, atomicMode, WordType>::operator==(const Bitmap& other) const
+{
+ for (size_t i = 0; i < words; ++i) {
+ if (bits[i] != other.bits[i])
+ return false;
+ }
+ return true;
+}
+
+template<size_t bitmapSize, BitmapAtomicMode atomicMode, typename WordType>
+inline bool Bitmap<bitmapSize, atomicMode, WordType>::operator!=(const Bitmap& other) const
+{
+ return !(*this == other);
+}
+
+template<size_t bitmapSize, BitmapAtomicMode atomicMode, typename WordType>
+inline unsigned Bitmap<bitmapSize, atomicMode, WordType>::hash() const
+{
+ unsigned result = 0;
+ for (size_t i = 0; i < words; ++i)
+ result ^= IntHash<WordType>::hash(bits[i]);
+ return result;
+}
+
+} // namespace WTF
+
+using WTF::Bitmap;
+
#endif