Title: [183204] trunk
Revision
183204
Author
[email protected]
Date
2015-04-23 12:56:57 -0700 (Thu, 23 Apr 2015)

Log Message

Use less memory when compiling content extensions.
https://bugs.webkit.org/show_bug.cgi?id=144051

Patch by Alex Christensen <[email protected]> on 2015-04-23
Reviewed by Darin Adler and Benjamin Poulain.

Source/WebCore:

No change in functionality, correctness already covered by existing tests.

Before this patch, a DFANode contained a HashSet of transitions.
Large vectors of DFANodes made many small HashSets, which was inefficient use of memory.
We now put all the actions and transitions into one big compact Vector and each node owns ranges in that vector.

* contentextensions/CombinedURLFilters.cpp:
(WebCore::ContentExtensions::recursiveMemoryUsed):
(WebCore::ContentExtensions::CombinedURLFilters::memoryUsed):
* contentextensions/CombinedURLFilters.h:
* contentextensions/ContentExtensionCompiler.cpp:
(WebCore::ContentExtensions::compileRuleList):
* contentextensions/ContentExtensionsDebugging.h:
* contentextensions/DFA.cpp:
(WebCore::ContentExtensions::DFA::memoryUsed):
(WebCore::ContentExtensions::DFANode::actions):
(WebCore::ContentExtensions::DFANode::transitions):
(WebCore::ContentExtensions::DFANode::fallbackTransitionDestination):
(WebCore::ContentExtensions::DFANode::changeFallbackTransition):
(WebCore::ContentExtensions::DFANode::addFallbackTransition):
(WebCore::ContentExtensions::DFANode::containsTransition):
(WebCore::ContentExtensions::DFANode::kill):
(WebCore::ContentExtensions::DFA::minimize):
(WebCore::ContentExtensions::DFA::DFA): Deleted.
(WebCore::ContentExtensions::DFA::operator=): Deleted.
* contentextensions/DFA.h:
* contentextensions/DFABytecodeCompiler.cpp:
(WebCore::ContentExtensions::DFABytecodeCompiler::compileNode):
(WebCore::ContentExtensions::DFABytecodeCompiler::compileNodeTransitions):
(WebCore::ContentExtensions::DFABytecodeCompiler::compile):
* contentextensions/DFABytecodeCompiler.h:
* contentextensions/DFAMinimizer.cpp:
(WebCore::ContentExtensions::DFAMinimizer::minimize):
* contentextensions/DFAMinimizer.h:
* contentextensions/DFANode.h:
(WebCore::ContentExtensions::DFANode::isKilled):
(WebCore::ContentExtensions::DFANode::hasFallbackTransition):
(WebCore::ContentExtensions::DFANode::hasActions):
(WebCore::ContentExtensions::DFANode::transitionsLength):
(WebCore::ContentExtensions::DFANode::actionsLength):
(WebCore::ContentExtensions::DFANode::actionsStart):
(WebCore::ContentExtensions::DFANode::setActions):
(WebCore::ContentExtensions::DFANode::setTransitions):
(WebCore::ContentExtensions::DFANode::resetTransitions):
(WebCore::ContentExtensions::DFANode::transitionsStart):
(WebCore::ContentExtensions::DFANode::setHasFallbackTransitionWithoutChangingDFA):
* contentextensions/NFA.cpp:
(WebCore::ContentExtensions::NFA::memoryUsed):
* contentextensions/NFA.h:
* contentextensions/NFAToDFA.cpp:
(WebCore::ContentExtensions::NodeIdSetToUniqueNodeIdSetSource::NodeIdSetToUniqueNodeIdSetSource):
(WebCore::ContentExtensions::NodeIdSetToUniqueNodeIdSetTranslator::translate):
(WebCore::ContentExtensions::getOrCreateDFANode):
(WebCore::ContentExtensions::NFAToDFA::convert):

Tools:

* TestWebKitAPI/Tests/WebCore/ContentExtensions.cpp:
(TestWebKitAPI::TEST_F):
* TestWebKitAPI/Tests/WebCore/DFAMinimizer.cpp:
(TestWebKitAPI::countLiveNodes):

Modified Paths

Diff

Modified: trunk/Source/WebCore/ChangeLog (183203 => 183204)


--- trunk/Source/WebCore/ChangeLog	2015-04-23 19:45:00 UTC (rev 183203)
+++ trunk/Source/WebCore/ChangeLog	2015-04-23 19:56:57 UTC (rev 183204)
@@ -1,3 +1,65 @@
+2015-04-23  Alex Christensen  <[email protected]>
+
+        Use less memory when compiling content extensions.
+        https://bugs.webkit.org/show_bug.cgi?id=144051
+
+        Reviewed by Darin Adler and Benjamin Poulain.
+
+        No change in functionality, correctness already covered by existing tests.
+
+        Before this patch, a DFANode contained a HashSet of transitions.
+        Large vectors of DFANodes made many small HashSets, which was inefficient use of memory.
+        We now put all the actions and transitions into one big compact Vector and each node owns ranges in that vector.
+
+        * contentextensions/CombinedURLFilters.cpp:
+        (WebCore::ContentExtensions::recursiveMemoryUsed):
+        (WebCore::ContentExtensions::CombinedURLFilters::memoryUsed):
+        * contentextensions/CombinedURLFilters.h:
+        * contentextensions/ContentExtensionCompiler.cpp:
+        (WebCore::ContentExtensions::compileRuleList):
+        * contentextensions/ContentExtensionsDebugging.h:
+        * contentextensions/DFA.cpp:
+        (WebCore::ContentExtensions::DFA::memoryUsed):
+        (WebCore::ContentExtensions::DFANode::actions):
+        (WebCore::ContentExtensions::DFANode::transitions):
+        (WebCore::ContentExtensions::DFANode::fallbackTransitionDestination):
+        (WebCore::ContentExtensions::DFANode::changeFallbackTransition):
+        (WebCore::ContentExtensions::DFANode::addFallbackTransition):
+        (WebCore::ContentExtensions::DFANode::containsTransition):
+        (WebCore::ContentExtensions::DFANode::kill):
+        (WebCore::ContentExtensions::DFA::minimize):
+        (WebCore::ContentExtensions::DFA::DFA): Deleted.
+        (WebCore::ContentExtensions::DFA::operator=): Deleted.
+        * contentextensions/DFA.h:
+        * contentextensions/DFABytecodeCompiler.cpp:
+        (WebCore::ContentExtensions::DFABytecodeCompiler::compileNode):
+        (WebCore::ContentExtensions::DFABytecodeCompiler::compileNodeTransitions):
+        (WebCore::ContentExtensions::DFABytecodeCompiler::compile):
+        * contentextensions/DFABytecodeCompiler.h:
+        * contentextensions/DFAMinimizer.cpp:
+        (WebCore::ContentExtensions::DFAMinimizer::minimize):
+        * contentextensions/DFAMinimizer.h:
+        * contentextensions/DFANode.h:
+        (WebCore::ContentExtensions::DFANode::isKilled):
+        (WebCore::ContentExtensions::DFANode::hasFallbackTransition):
+        (WebCore::ContentExtensions::DFANode::hasActions):
+        (WebCore::ContentExtensions::DFANode::transitionsLength):
+        (WebCore::ContentExtensions::DFANode::actionsLength):
+        (WebCore::ContentExtensions::DFANode::actionsStart):
+        (WebCore::ContentExtensions::DFANode::setActions):
+        (WebCore::ContentExtensions::DFANode::setTransitions):
+        (WebCore::ContentExtensions::DFANode::resetTransitions):
+        (WebCore::ContentExtensions::DFANode::transitionsStart):
+        (WebCore::ContentExtensions::DFANode::setHasFallbackTransitionWithoutChangingDFA):
+        * contentextensions/NFA.cpp:
+        (WebCore::ContentExtensions::NFA::memoryUsed):
+        * contentextensions/NFA.h:
+        * contentextensions/NFAToDFA.cpp:
+        (WebCore::ContentExtensions::NodeIdSetToUniqueNodeIdSetSource::NodeIdSetToUniqueNodeIdSetSource):
+        (WebCore::ContentExtensions::NodeIdSetToUniqueNodeIdSetTranslator::translate):
+        (WebCore::ContentExtensions::getOrCreateDFANode):
+        (WebCore::ContentExtensions::NFAToDFA::convert):
+
 2015-04-23  David Hyatt  <[email protected]>
 
         Don't fire a bunch of mouse moveds during scrolling.

Modified: trunk/Source/WebCore/contentextensions/CombinedURLFilters.cpp (183203 => 183204)


--- trunk/Source/WebCore/contentextensions/CombinedURLFilters.cpp	2015-04-23 19:45:00 UTC (rev 183203)
+++ trunk/Source/WebCore/contentextensions/CombinedURLFilters.cpp	2015-04-23 19:56:57 UTC (rev 183204)
@@ -43,6 +43,21 @@
     bool inVariableLengthPrefix { false };
 };
 
+static size_t recursiveMemoryUsed(const std::unique_ptr<PrefixTreeVertex>& node)
+{
+    size_t size = sizeof(PrefixTreeVertex)
+        + node->edges.capacity() * sizeof(std::pair<Term, std::unique_ptr<PrefixTreeVertex>>)
+        + node->finalActions.capacity() * sizeof(uint64_t);
+    for (const auto& child : node->edges.values())
+        size += recursiveMemoryUsed(child);
+    return size;
+}
+
+size_t CombinedURLFilters::memoryUsed() const
+{
+    return recursiveMemoryUsed(m_prefixTreeRoot);
+}
+    
 CombinedURLFilters::CombinedURLFilters()
     : m_prefixTreeRoot(std::make_unique<PrefixTreeVertex>())
 {

Modified: trunk/Source/WebCore/contentextensions/CombinedURLFilters.h (183203 => 183204)


--- trunk/Source/WebCore/contentextensions/CombinedURLFilters.h	2015-04-23 19:45:00 UTC (rev 183203)
+++ trunk/Source/WebCore/contentextensions/CombinedURLFilters.h	2015-04-23 19:56:57 UTC (rev 183204)
@@ -47,6 +47,8 @@
     Vector<NFA> createNFAs() const;
     void clear();
 
+    size_t memoryUsed() const;
+    
 private:
     std::unique_ptr<PrefixTreeVertex> m_prefixTreeRoot;
 };

Modified: trunk/Source/WebCore/contentextensions/ContentExtensionCompiler.cpp (183203 => 183204)


--- trunk/Source/WebCore/contentextensions/ContentExtensionCompiler.cpp	2015-04-23 19:45:00 UTC (rev 183203)
+++ trunk/Source/WebCore/contentextensions/ContentExtensionCompiler.cpp	2015-04-23 19:56:57 UTC (rev 183204)
@@ -134,6 +134,7 @@
     Vector<SerializedActionByte> actions;
     Vector<unsigned> actionLocations = serializeActions(parsedRuleList, actions);
     client.writeActions(WTF::move(actions));
+    LOG_LARGE_STRUCTURES(actions, actions.capacity() * sizeof(SerializedActionByte));
     actions.clear();
 
     HashSet<uint64_t, DefaultHash<uint64_t>::Hash, WTF::UnsignedWithZeroKeyHashTraits<uint64_t>> universalActionLocations;
@@ -164,6 +165,8 @@
         if (contentExtensionRule.action().type() == ActionType::IgnorePreviousRules)
             ignorePreviousRulesSeen = true;
     }
+    LOG_LARGE_STRUCTURES(parsedRuleList, parsedRuleList.capacity() * sizeof(ContentExtensionRule)); // Doesn't include strings.
+    LOG_LARGE_STRUCTURES(actionLocations, actionLocations.capacity() * sizeof(unsigned));
     parsedRuleList.clear();
     actionLocations.clear();
 
@@ -177,6 +180,7 @@
 #endif
 
     Vector<NFA> nfas = combinedURLFilters.createNFAs();
+    LOG_LARGE_STRUCTURES(combinedURLFilters, combinedURLFilters.memoryUsed());
     combinedURLFilters.clear();
     if (!nfas.size() && universalActionLocations.size())
         nfas.append(NFA());
@@ -204,6 +208,7 @@
         double dfaBuildTimeStart = monotonicallyIncreasingTime();
 #endif
         DFA dfa = NFAToDFA::convert(nfas[i]);
+        LOG_LARGE_STRUCTURES(dfa, dfa.memoryUsed());
 #if CONTENT_EXTENSIONS_PERFORMANCE_REPORTING
         double dfaBuildTimeEnd = monotonicallyIncreasingTime();
         dataLogF("    Time spent building the DFA %zu: %f\n", i, (dfaBuildTimeEnd - dfaBuildTimeStart));
@@ -222,16 +227,22 @@
         WTFLogAlways("DFA %zu", i);
         dfa.debugPrintDot();
 #endif
-
-        ASSERT_WITH_MESSAGE(!dfa.nodeAt(dfa.root()).actions.size(), "All actions on the DFA root should come from regular expressions that match everything.");
+        ASSERT_WITH_MESSAGE(!dfa.nodes[dfa.root].hasActions(), "All actions on the DFA root should come from regular expressions that match everything.");
         if (!i) {
             // Put all the universal actions on the first DFA.
+            unsigned actionsStart = dfa.actions.size();
+            dfa.actions.reserveCapacity(dfa.actions.size() + universalActionLocations.size());
             for (uint64_t actionLocation : universalActionLocations)
-                dfa.nodeAt(dfa.root()).actions.append(actionLocation);
+                dfa.actions.uncheckedAppend(actionLocation);
+            unsigned actionsEnd = dfa.actions.size();
+            unsigned actionsLength = actionsEnd - actionsStart;
+            RELEASE_ASSERT_WITH_MESSAGE(actionsLength < std::numeric_limits<uint16_t>::max(), "Too many uncombined actions that match everything");
+            dfa.nodes[dfa.root].setActions(actionsStart, static_cast<uint16_t>(actionsLength));
         }
         DFABytecodeCompiler compiler(dfa, bytecode);
         compiler.compile();
     }
+    LOG_LARGE_STRUCTURES(universalActionLocations, universalActionLocations.capacity() * sizeof(unsigned));
     universalActionLocations.clear();
 
 #if CONTENT_EXTENSIONS_PERFORMANCE_REPORTING
@@ -240,9 +251,15 @@
     dataLogF("    Bytecode size %zu\n", bytecode.size());
     dataLogF("    DFA count %zu\n", nfas.size());
 #endif
+    size_t nfaMemoryUsed = 0;
+    for (const NFA& nfa : nfas)
+        nfaMemoryUsed += sizeof(NFA) + nfa.memoryUsed();
+    LOG_LARGE_STRUCTURES(nfas, nfaMemoryUsed);
     nfas.clear();
 
+    LOG_LARGE_STRUCTURES(bytecode, bytecode.capacity() * sizeof(uint8_t));
     client.writeBytecode(WTF::move(bytecode));
+    bytecode.clear();
 
     return { };
 }

Modified: trunk/Source/WebCore/contentextensions/ContentExtensionsDebugging.h (183203 => 183204)


--- trunk/Source/WebCore/contentextensions/ContentExtensionsDebugging.h	2015-04-23 19:45:00 UTC (rev 183203)
+++ trunk/Source/WebCore/contentextensions/ContentExtensionsDebugging.h	2015-04-23 19:56:57 UTC (rev 183204)
@@ -35,6 +35,12 @@
 #define CONTENT_EXTENSIONS_MEMORY_REPORTING 0
 #define CONTENT_EXTENSIONS_PAGE_SIZE 16384
 
+#if CONTENT_EXTENSIONS_PERFORMANCE_REPORTING
+#define LOG_LARGE_STRUCTURES(name, size) if (size > 1000000) { WTFLogAlways("NAME: %s SIZE %d", #name, (int)(size)); };
+#else
+#define LOG_LARGE_STRUCTURES(name, size)
+#endif
+
 #endif // ENABLE(CONTENT_EXTENSIONS)
 
 #endif // ContentExtensionsDebugging_h

Modified: trunk/Source/WebCore/contentextensions/DFA.cpp (183203 => 183204)


--- trunk/Source/WebCore/contentextensions/DFA.cpp	2015-04-23 19:45:00 UTC (rev 183203)
+++ trunk/Source/WebCore/contentextensions/DFA.cpp	2015-04-23 19:56:57 UTC (rev 183204)
@@ -35,34 +35,90 @@
 
 namespace ContentExtensions {
 
-DFA::DFA()
-    : m_root(0)
+size_t DFA::memoryUsed() const
 {
+    return sizeof(DFA)
+        + actions.size() * sizeof(uint64_t)
+        + transitions.size() * sizeof(std::pair<uint8_t, uint32_t>)
+        + nodes.size() * sizeof(DFANode);
 }
 
-DFA::DFA(Vector<DFANode>&& nodes, unsigned rootIndex)
-    : m_nodes(WTF::move(nodes))
-    , m_root(rootIndex)
+// FIXME: Make DFANode.cpp.
+Vector<uint64_t> DFANode::actions(const DFA& dfa) const
 {
-    ASSERT(rootIndex < m_nodes.size());
+    // FIXME: Use iterators instead of copying the Vector elements.
+    Vector<uint64_t> vector;
+    vector.reserveInitialCapacity(m_actionsLength);
+    for (uint32_t i = m_actionsStart; i < m_actionsStart + m_actionsLength; ++i)
+        vector.uncheckedAppend(dfa.actions[i]);
+    return vector;
 }
+    
+Vector<std::pair<uint8_t, uint32_t>> DFANode::transitions(const DFA& dfa) const
+{
+    // FIXME: Use iterators instead of copying the Vector elements.
+    Vector<std::pair<uint8_t, uint32_t>> vector;
+    vector.reserveInitialCapacity(transitionsLength());
+    for (uint32_t i = m_transitionsStart; i < m_transitionsStart + m_transitionsLength; ++i)
+        vector.uncheckedAppend(dfa.transitions[i]);
+    return vector;
+}
 
-DFA::DFA(const DFA& dfa)
-    : m_nodes(dfa.m_nodes)
-    , m_root(dfa.m_root)
+uint32_t DFANode::fallbackTransitionDestination(const DFA& dfa) const
 {
+    RELEASE_ASSERT(hasFallbackTransition());
+
+    // If there is a fallback transition, it is just after the other transitions and has an invalid ASCII character to mark it as a fallback transition.
+    ASSERT(dfa.transitions[m_transitionsStart + m_transitionsLength].first == std::numeric_limits<uint8_t>::max());
+    return dfa.transitions[m_transitionsStart + m_transitionsLength].second;
 }
 
-DFA& DFA::operator=(const DFA& dfa)
+void DFANode::changeFallbackTransition(DFA& dfa, uint32_t newDestination)
 {
-    m_nodes = dfa.m_nodes;
-    m_root = dfa.m_root;
-    return *this;
+    RELEASE_ASSERT(hasFallbackTransition());
+    ASSERT_WITH_MESSAGE(dfa.transitions[m_transitionsStart + m_transitionsLength].first == std::numeric_limits<uint8_t>::max(), "When changing a fallback transition, the fallback transition should already be marked as such");
+    dfa.transitions[m_transitionsStart + m_transitionsLength] = std::pair<uint8_t, uint32_t>(std::numeric_limits<uint8_t>::max(), newDestination);
 }
 
+void DFANode::addFallbackTransition(DFA& dfa, uint32_t destination)
+{
+    RELEASE_ASSERT_WITH_MESSAGE(dfa.transitions.size() == m_transitionsStart + m_transitionsLength, "Adding a fallback transition should only happen if the node is at the end");
+    dfa.transitions.append(std::pair<uint8_t, uint32_t>(std::numeric_limits<uint8_t>::max(), destination));
+    ASSERT(!(m_flags & HasFallbackTransition));
+    m_flags |= HasFallbackTransition;
+}
+
+bool DFANode::containsTransition(uint8_t transition, DFA& dfa)
+{
+    // Called from DFAMinimizer, this loops though a maximum of 128 transitions, so it's not too slow.
+    ASSERT(m_transitionsLength <= 128);
+    for (unsigned i = m_transitionsStart; i < m_transitionsStart + m_transitionsLength; ++i) {
+        if (dfa.transitions[i].first == transition)
+            return true;
+    }
+    return false;
+}
+    
+void DFANode::kill(DFA& dfa)
+{
+    ASSERT(m_flags != IsKilled);
+    m_flags = IsKilled; // Killed nodes don't have any other flags.
+    
+    // Invalidate the now-unused memory in the DFA to make finding bugs easier.
+    for (unsigned i = m_transitionsStart; i < m_transitionsStart + m_transitionsLength; ++i)
+        dfa.transitions[i] = std::make_pair(std::numeric_limits<uint8_t>::max(), std::numeric_limits<uint32_t>::max());
+    for (unsigned i = m_actionsStart; i < m_actionsStart + m_actionsLength; ++i)
+        dfa.actions[i] = std::numeric_limits<uint64_t>::max();
+
+    m_actionsStart = 0;
+    m_actionsLength = 0;
+    m_transitionsStart = 0;
+    m_transitionsLength = 0;
+};
+
 void DFA::minimize()
 {
-    m_root = DFAMinimizer::minimize(m_nodes, m_root);
+    DFAMinimizer::minimize(*this);
 }
 
 #if CONTENT_EXTENSIONS_STATE_MACHINE_DEBUGGING

Modified: trunk/Source/WebCore/contentextensions/DFA.h (183203 => 183204)


--- trunk/Source/WebCore/contentextensions/DFA.h	2015-04-23 19:45:00 UTC (rev 183203)
+++ trunk/Source/WebCore/contentextensions/DFA.h	2015-04-23 19:56:57 UTC (rev 183204)
@@ -37,28 +37,19 @@
 namespace ContentExtensions {
 
 // The DFA abstract a partial DFA graph in a compact form.
-class WEBCORE_EXPORT DFA {
-public:
-    DFA();
-    DFA(Vector<DFANode>&& nodes, unsigned rootIndex);
-    DFA(const DFA& dfa);
-
-    DFA& operator=(const DFA&);
-
-    unsigned root() const { return m_root; }
-    unsigned size() const { return m_nodes.size(); }
-    const DFANode& nodeAt(unsigned i) const { return m_nodes[i]; }
-    DFANode& nodeAt(unsigned i) { return m_nodes[i]; }
-
+struct WEBCORE_EXPORT DFA {
     void minimize();
+    size_t memoryUsed() const;
 
 #if CONTENT_EXTENSIONS_STATE_MACHINE_DEBUGGING
     void debugPrintDot() const;
 #endif
-
-private:
-    Vector<DFANode> m_nodes;
-    unsigned m_root;
+    
+    Vector<uint64_t> actions;
+    // FIXME: transitions could be two Vectors to save even more memory.
+    Vector<std::pair<uint8_t, uint32_t>> transitions;
+    Vector<DFANode> nodes;
+    unsigned root { 0 };
 };
 
 }

Modified: trunk/Source/WebCore/contentextensions/DFABytecodeCompiler.cpp (183203 => 183204)


--- trunk/Source/WebCore/contentextensions/DFABytecodeCompiler.cpp	2015-04-23 19:45:00 UTC (rev 183203)
+++ trunk/Source/WebCore/contentextensions/DFABytecodeCompiler.cpp	2015-04-23 19:56:57 UTC (rev 183204)
@@ -95,8 +95,8 @@
 
 void DFABytecodeCompiler::compileNode(unsigned index, bool root)
 {
-    const DFANode& node = m_dfa.nodeAt(index);
-    if (node.isKilled) {
+    const DFANode& node = m_dfa.nodes[index];
+    if (node.isKilled()) {
         m_nodeStartOffsets[index] = std::numeric_limits<unsigned>::max();
         return;
     }
@@ -105,7 +105,7 @@
     if (!root)
         m_nodeStartOffsets[index] = m_bytecode.size();
 
-    for (uint64_t action : node.actions) {
+    for (uint64_t action : node.actions(m_dfa)) {
         // High bits are used to store flags. See compileRuleList.
         if (action & 0xFFFF00000000)
             emitTestFlagsAndAppendAction(static_cast<uint16_t>(action >> 32), static_cast<unsigned>(action));
@@ -123,14 +123,14 @@
 
 void DFABytecodeCompiler::compileNodeTransitions(const DFANode& node)
 {
-    unsigned destinations[128];
-    const unsigned noDestination = std::numeric_limits<unsigned>::max();
-    for (uint16_t i = 0; i < 128; i++) {
-        auto it = node.transitions.find(i);
-        if (it == node.transitions.end())
-            destinations[i] = noDestination;
-        else
-            destinations[i] = it->value;
+    uint32_t destinations[128];
+    memset(destinations, 0xff, sizeof(destinations));
+    const uint32_t noDestination = std::numeric_limits<uint32_t>::max();
+
+    for (const auto& pair : node.transitions(m_dfa)) {
+        RELEASE_ASSERT(pair.first < WTF_ARRAY_LENGTH(destinations));
+        ASSERT_WITH_MESSAGE(destinations[pair.first] == noDestination, "Transitions should be unique");
+        destinations[pair.first] = pair.second;
     }
 
     Vector<Range> ranges;
@@ -138,7 +138,7 @@
     bool hasRangeMin = false;
     for (uint8_t i = 0; i < 128; i++) {
         if (hasRangeMin) {
-            ASSERT_WITH_MESSAGE(!(node.hasFallbackTransition && node.fallbackTransition == destinations[rangeMin]), "Individual transitions to the fallback transitions should have been eliminated by the optimizer.");
+            ASSERT_WITH_MESSAGE(!(node.hasFallbackTransition() && node.fallbackTransitionDestination(m_dfa) == destinations[rangeMin]), "Individual transitions to the fallback transitions should have been eliminated by the optimizer.");
             if (destinations[i] != destinations[rangeMin]) {
 
                 // This is the end of a range. Check if it can be case insensitive.
@@ -186,8 +186,8 @@
 
     for (const auto& range : ranges)
         compileCheckForRange(range);
-    if (node.hasFallbackTransition)
-        emitJump(node.fallbackTransition);
+    if (node.hasFallbackTransition())
+        emitJump(node.fallbackTransitionDestination(m_dfa));
     else
         emitTerminate();
 }
@@ -208,12 +208,12 @@
     // DFA header.
     unsigned startLocation = m_bytecode.size();
     append<unsigned>(m_bytecode, 0);
-    m_nodeStartOffsets.resize(m_dfa.size());
+    m_nodeStartOffsets.resize(m_dfa.nodes.size());
     
     // Make sure the root is always at the beginning of the bytecode.
-    compileNode(m_dfa.root(), true);
-    for (unsigned i = 0; i < m_dfa.size(); i++) {
-        if (i != m_dfa.root())
+    compileNode(m_dfa.root, true);
+    for (unsigned i = 0; i < m_dfa.nodes.size(); i++) {
+        if (i != m_dfa.root)
             compileNode(i, false);
     }
 

Modified: trunk/Source/WebCore/contentextensions/DFABytecodeCompiler.h (183203 => 183204)


--- trunk/Source/WebCore/contentextensions/DFABytecodeCompiler.h	2015-04-23 19:45:00 UTC (rev 183203)
+++ trunk/Source/WebCore/contentextensions/DFABytecodeCompiler.h	2015-04-23 19:56:57 UTC (rev 183204)
@@ -35,7 +35,7 @@
 
 namespace ContentExtensions {
 
-class DFA;
+struct DFA;
 class DFANode;
 
 class DFABytecodeCompiler {

Modified: trunk/Source/WebCore/contentextensions/DFAMinimizer.cpp (183203 => 183204)


--- trunk/Source/WebCore/contentextensions/DFAMinimizer.cpp	2015-04-23 19:45:00 UTC (rev 183203)
+++ trunk/Source/WebCore/contentextensions/DFAMinimizer.cpp	2015-04-23 19:56:57 UTC (rev 183204)
@@ -28,8 +28,12 @@
 
 #if ENABLE(CONTENT_EXTENSIONS)
 
+#include "DFA.h"
+#include "DFANode.h"
+#include <wtf/HashMap.h>
 #include <wtf/HashSet.h>
 #include <wtf/StringHasher.h>
+#include <wtf/Vector.h>
 
 namespace WebCore {
 namespace ContentExtensions {
@@ -39,20 +43,22 @@
 // simplifyTransitions() tries to collapse individual transitions into fallback transitions.
 // After simplifyTransitions(), you can also make the assumption that a fallback transition target will never be
 // also the target of an individual transition.
-static void simplifyTransitions(Vector<DFANode>& dfaGraph)
+static void simplifyTransitions(DFA& dfa)
 {
-    for (DFANode& dfaNode : dfaGraph) {
-        if (!dfaNode.hasFallbackTransition
-            && ((dfaNode.transitions.size() == 126 && !dfaNode.transitions.contains(0))
-                || (dfaNode.transitions.size() == 127 && dfaNode.transitions.contains(0)))) {
+    for (DFANode& dfaNode : dfa.nodes) {
+        bool addingFallback = false;
+        uint32_t newFallbackDestination = std::numeric_limits<uint32_t>::max();
+        if (!dfaNode.hasFallbackTransition()
+            && ((dfaNode.transitionsLength() == 126 && !dfaNode.containsTransition(0, dfa))
+                || (dfaNode.transitionsLength() == 127 && dfaNode.containsTransition(0, dfa)))) {
             unsigned bestTarget = std::numeric_limits<unsigned>::max();
             unsigned bestTargetScore = 0;
             HashMap<unsigned, unsigned, DefaultHash<unsigned>::Hash, WTF::UnsignedWithZeroKeyHashTraits<unsigned>> targetHistogram;
-            for (const auto& transition : dfaNode.transitions) {
-                if (!transition.key)
+            for (const auto& transition : dfaNode.transitions(dfa)) {
+                if (!transition.first)
                     continue;
 
-                unsigned transitionTarget = transition.value;
+                unsigned transitionTarget = transition.second;
                 auto addResult = targetHistogram.add(transitionTarget, 1);
                 if (!addResult.isNewEntry)
                     addResult.iterator->value++;
@@ -64,20 +70,44 @@
             }
             ASSERT_WITH_MESSAGE(bestTargetScore, "There should be at least one valid target since having transitions is a precondition to enter this path.");
 
-            dfaNode.hasFallbackTransition = true;
-            dfaNode.fallbackTransition = bestTarget;
+            newFallbackDestination = bestTarget;
+            addingFallback = true;
         }
+        
+        // Use the same location to write new transitions possibly followed by unused memory.
+        // We can do this because we are always decreasing the amount of memory used.
+        // We will probably need something like setHasFallbackTransitionWithoutChangingDFA to do that.
 
-        if (dfaNode.hasFallbackTransition) {
-            Vector<uint16_t, 128> keys;
-            DFANodeTransitions& transitions = dfaNode.transitions;
-            copyKeysToVector(transitions, keys);
-
-            for (uint16_t key : keys) {
-                auto transitionIterator = transitions.find(key);
-                if (transitionIterator->value == dfaNode.fallbackTransition)
-                    transitions.remove(transitionIterator);
+        unsigned oldFallbackTransition = std::numeric_limits<uint32_t>::max();
+        bool hadFallbackTransition = dfaNode.hasFallbackTransition();
+        if (hadFallbackTransition)
+            oldFallbackTransition = dfaNode.fallbackTransitionDestination(dfa);
+        
+        newFallbackDestination = (newFallbackDestination == std::numeric_limits<uint32_t>::max() ? oldFallbackTransition : newFallbackDestination);
+        ASSERT(!addingFallback || (newFallbackDestination != std::numeric_limits<uint32_t>::max() && oldFallbackTransition == std::numeric_limits<uint32_t>::max()));
+        bool willHaveFallback = newFallbackDestination != std::numeric_limits<uint32_t>::max();
+        dfaNode.setHasFallbackTransitionWithoutChangingDFA(willHaveFallback);
+        
+        if (willHaveFallback) {
+            Vector<std::pair<uint8_t, uint32_t>> transitions = dfaNode.transitions(dfa);
+            unsigned availableSlotCount = transitions.size() + hadFallbackTransition;
+            for (unsigned i = 0; i < transitions.size(); ++i) {
+                if (transitions[i].second == newFallbackDestination)
+                    transitions.remove(i--);
             }
+        
+            RELEASE_ASSERT(transitions.size() + willHaveFallback <= availableSlotCount);
+        
+            unsigned firstSlot = dfaNode.transitionsStart();
+            dfaNode.resetTransitions(firstSlot, transitions.size());
+            for (unsigned i = 0; i < transitions.size(); ++i)
+                dfa.transitions[firstSlot + i] = transitions[i];
+            for (unsigned i = transitions.size(); i < availableSlotCount; ++i) {
+                // Invalidate now-unused memory to make finding bugs easier.
+                dfa.transitions[firstSlot + i] = std::make_pair(std::numeric_limits<uint8_t>::max(), std::numeric_limits<uint32_t>::max());
+            }
+            if (willHaveFallback)
+                dfa.transitions[firstSlot + transitions.size()] = std::make_pair(std::numeric_limits<uint8_t>::max(), newFallbackDestination);
         }
     }
 }
@@ -221,24 +251,24 @@
 
 class FullGraphPartition {
 public:
-    FullGraphPartition(const Vector<DFANode>& dfaGraph)
+    FullGraphPartition(const DFA& dfa)
     {
-        m_nodePartition.initialize(dfaGraph.size());
+        m_nodePartition.initialize(dfa.nodes.size());
 
-        m_flattenedTransitionsStartOffsetPerNode.resize(dfaGraph.size());
+        m_flattenedTransitionsStartOffsetPerNode.resize(dfa.nodes.size());
         for (unsigned& counter : m_flattenedTransitionsStartOffsetPerNode)
             counter = 0;
 
-        m_flattenedFallbackTransitionsStartOffsetPerNode.resize(dfaGraph.size());
+        m_flattenedFallbackTransitionsStartOffsetPerNode.resize(dfa.nodes.size());
         for (unsigned& counter : m_flattenedFallbackTransitionsStartOffsetPerNode)
             counter = 0;
 
         // Count the number of incoming transitions per node.
-        for (const DFANode& dfaNode : dfaGraph) {
-            for (const auto& transition : dfaNode.transitions)
-                ++m_flattenedTransitionsStartOffsetPerNode[transition.value];
-            if (dfaNode.hasFallbackTransition)
-                ++m_flattenedFallbackTransitionsStartOffsetPerNode[dfaNode.fallbackTransition];
+        for (const DFANode& dfaNode : dfa.nodes) {
+            for (const auto& transition : dfaNode.transitions(dfa))
+                ++m_flattenedTransitionsStartOffsetPerNode[transition.second];
+            if (dfaNode.hasFallbackTransition())
+                ++m_flattenedFallbackTransitionsStartOffsetPerNode[dfaNode.fallbackTransitionDestination(dfa)];
         }
 
         // Accumulate the offsets.
@@ -259,11 +289,11 @@
         unsigned flattenedFallbackTransitionsSize = fallbackTransitionAccumulator;
 
         // Next, let's fill the transition table and set up the size of each group at the same time.
-        m_flattenedTransitionsSizePerNode.resize(dfaGraph.size());
+        m_flattenedTransitionsSizePerNode.resize(dfa.nodes.size());
         for (unsigned& counter : m_flattenedTransitionsSizePerNode)
             counter = 0;
 
-        m_flattenedFallbackTransitionsSizePerNode.resize(dfaGraph.size());
+        m_flattenedFallbackTransitionsSizePerNode.resize(dfa.nodes.size());
         for (unsigned& counter : m_flattenedFallbackTransitionsSizePerNode)
             counter = 0;
 
@@ -271,20 +301,20 @@
 
         m_flattenedFallbackTransitions.resize(flattenedFallbackTransitionsSize);
 
-        for (unsigned i = 0; i < dfaGraph.size(); ++i) {
-            const DFANode& dfaNode = dfaGraph[i];
-            for (const auto& transition : dfaNode.transitions) {
-                unsigned targetNodeIndex = transition.value;
+        for (unsigned i = 0; i < dfa.nodes.size(); ++i) {
+            const DFANode& dfaNode = dfa.nodes[i];
+            for (const auto& transition : dfaNode.transitions(dfa)) {
+                unsigned targetNodeIndex = transition.second;
 
                 unsigned start = m_flattenedTransitionsStartOffsetPerNode[targetNodeIndex];
                 unsigned offset = m_flattenedTransitionsSizePerNode[targetNodeIndex];
 
-                m_flattenedTransitions[start + offset] = Transition({ i, targetNodeIndex, transition.key });
+                m_flattenedTransitions[start + offset] = Transition({ i, targetNodeIndex, transition.first });
 
                 ++m_flattenedTransitionsSizePerNode[targetNodeIndex];
             }
-            if (dfaNode.hasFallbackTransition) {
-                unsigned targetNodeIndex = dfaNode.fallbackTransition;
+            if (dfaNode.hasFallbackTransition()) {
+                unsigned targetNodeIndex = dfaNode.fallbackTransitionDestination(dfa);
 
                 unsigned start = m_flattenedFallbackTransitionsStartOffsetPerNode[targetNodeIndex];
                 unsigned offset = m_flattenedFallbackTransitionsSizePerNode[targetNodeIndex];
@@ -403,20 +433,25 @@
     enum EmptyValueTag { EmptyValue };
     explicit ActionKey(EmptyValueTag) { state = Empty; }
 
-    explicit ActionKey(const Vector<uint64_t>& actions)
-        : actions(&actions)
-        , state(Valid)
+    explicit ActionKey(const DFA* dfa, uint32_t actionsStart, uint16_t actionsLength)
+        : dfa(dfa)
+        , actionsStart(actionsStart)
+        , actionsLength(actionsLength)
     {
         StringHasher hasher;
-        hasher.addCharactersAssumingAligned(reinterpret_cast<const UChar*>(actions.data()), actions.size() * sizeof(uint64_t) / sizeof(UChar));
+        hasher.addCharactersAssumingAligned(reinterpret_cast<const UChar*>(&dfa->actions[actionsStart]), actionsLength * sizeof(uint64_t) / sizeof(UChar));
         hash = hasher.hash();
     }
 
     bool isEmptyValue() const { return state == Empty; }
     bool isDeletedValue() const { return state == Deleted; }
 
-    const Vector<uint64_t>* actions;
     unsigned hash;
+    
+    const DFA* dfa;
+    uint32_t actionsStart;
+    uint16_t actionsLength;
+    
     enum {
         Valid,
         Empty,
@@ -430,9 +465,18 @@
         return actionKey.hash;
     }
 
-    static bool equal(const ActionKey& a, const ActionKey& b)
+    // FIXME: Release builds on Mavericks fail with this inlined.
+    __attribute__((noinline)) static bool equal(const ActionKey& a, const ActionKey& b)
     {
-        return a.state == b.state && *a.actions == *b.actions;
+        if (a.state != b.state
+            || a.dfa != b.dfa
+            || a.actionsLength != b.actionsLength)
+            return false;
+        for (uint16_t i = 0; i < a.actionsLength; ++i) {
+            if (a.dfa->actions[a.actionsStart + i] != a.dfa->actions[b.actionsStart + i])
+                return false;
+        }
+        return true;
     }
     static const bool safeToCompareToEmptyOrDeleted = false;
 };
@@ -443,21 +487,21 @@
 
 } // anonymous namespace.
 
-unsigned DFAMinimizer::minimize(Vector<DFANode>& dfaGraph, unsigned root)
+void DFAMinimizer::minimize(DFA& dfa)
 {
-    simplifyTransitions(dfaGraph);
+    simplifyTransitions(dfa);
 
-    FullGraphPartition fullGraphPartition(dfaGraph);
+    FullGraphPartition fullGraphPartition(dfa);
 
     // Unlike traditional minimization final states can be differentiated by their action.
     // Instead of creating a single set for the final state, we partition by actions from
     // the start.
     HashMap<ActionKey, Vector<unsigned>, ActionKeyHash, ActionKeyHashTraits> finalStates;
-    for (unsigned i = 0; i < dfaGraph.size(); ++i) {
-        Vector<uint64_t>& actions = dfaGraph[i].actions;
-        if (actions.size()) {
-            std::sort(actions.begin(), actions.end());
-            auto addResult = finalStates.add(ActionKey(actions), Vector<unsigned>());
+    for (unsigned i = 0; i < dfa.nodes.size(); ++i) {
+        const DFANode& node = dfa.nodes[i];
+        if (node.hasActions()) {
+            // FIXME: Sort the actions in the dfa to make nodes that have the same actions in different order equal.
+            auto addResult = finalStates.add(ActionKey(&dfa, node.actionsStart(), node.actionsLength()), Vector<unsigned>());
             addResult.iterator->value.append(i);
         }
     }
@@ -481,36 +525,32 @@
     fullGraphPartition.splitByFallbackTransitions();
 
     Vector<unsigned> relocationVector;
-    relocationVector.reserveInitialCapacity(dfaGraph.size());
-    for (unsigned i = 0; i < dfaGraph.size(); ++i)
-        relocationVector.append(i);
+    relocationVector.reserveInitialCapacity(dfa.nodes.size());
+    for (unsigned i = 0; i < dfa.nodes.size(); ++i)
+        relocationVector.uncheckedAppend(i);
 
     // Kill the useless nodes and keep track of the new node transitions should point to.
-    for (unsigned i = 0; i < dfaGraph.size(); ++i) {
+    for (unsigned i = 0; i < dfa.nodes.size(); ++i) {
         unsigned replacement = fullGraphPartition.nodeReplacement(i);
         if (i != replacement) {
             relocationVector[i] = replacement;
-
-            DFANode& node = dfaGraph[i];
-            node.actions.clear();
-            node.transitions.clear();
-            node.hasFallbackTransition = false;
-            node.isKilled = true;
+            dfa.nodes[i].kill(dfa);
         }
     }
 
-    for (DFANode& node : dfaGraph) {
-        for (auto& transition : node.transitions)
-            transition.value = relocationVector[transition.value];
-        if (node.hasFallbackTransition)
-            node.fallbackTransition = relocationVector[node.fallbackTransition];
+    for (DFANode& node : dfa.nodes) {
+        auto nodeTransitions = node.transitions(dfa);
+        for (unsigned i = 0; i < node.transitionsLength(); ++i)
+            dfa.transitions[node.transitionsStart() + i].second = relocationVector[nodeTransitions[i].second];
+        if (node.hasFallbackTransition())
+            node.changeFallbackTransition(dfa, relocationVector[node.fallbackTransitionDestination(dfa)]);
     }
 
     // After minimizing, there is no guarantee individual transition are still poiting to different states.
     // The state pointed by one individual transition and the fallback states may have been merged.
-    simplifyTransitions(dfaGraph);
+    simplifyTransitions(dfa);
 
-    return relocationVector[root];
+    dfa.root = relocationVector[dfa.root];
 }
 
 } // namespace ContentExtensions

Modified: trunk/Source/WebCore/contentextensions/DFAMinimizer.h (183203 => 183204)


--- trunk/Source/WebCore/contentextensions/DFAMinimizer.h	2015-04-23 19:45:00 UTC (rev 183203)
+++ trunk/Source/WebCore/contentextensions/DFAMinimizer.h	2015-04-23 19:56:57 UTC (rev 183204)
@@ -28,15 +28,14 @@
 
 #if ENABLE(CONTENT_EXTENSIONS)
 
-#include "DFANode.h"
-#include <wtf/Vector.h>
-
 namespace WebCore {
 namespace ContentExtensions {
 
+struct DFA;
+
 class DFAMinimizer {
 public:
-    WEBCORE_EXPORT static unsigned minimize(Vector<DFANode>& dfaGraph, unsigned rootNode);
+    WEBCORE_EXPORT static void minimize(DFA&);
 };
 
 } // namespace ContentExtensions

Modified: trunk/Source/WebCore/contentextensions/DFANode.h (183203 => 183204)


--- trunk/Source/WebCore/contentextensions/DFANode.h	2015-04-23 19:45:00 UTC (rev 183203)
+++ trunk/Source/WebCore/contentextensions/DFANode.h	2015-04-23 19:56:57 UTC (rev 183204)
@@ -36,24 +36,74 @@
 
 namespace ContentExtensions {
 
+struct DFA;
+
 // A DFANode abstract the transition table out of a DFA state. If a state is accepting, the DFANode also have
 // the actions for that state.
-
-typedef HashMap<uint16_t, unsigned, DefaultHash<uint16_t>::Hash, WTF::UnsignedWithZeroKeyHashTraits<uint16_t>> DFANodeTransitions;
-
 class DFANode {
 public:
-    DFANodeTransitions transitions;
-    bool hasFallbackTransition { false };
-    bool isKilled { false };
-    unsigned fallbackTransition;
-    Vector<uint64_t> actions;
-
+    // FIXME: Stop minimizing killed nodes and add ASSERT(!isKilled()) in many functions here.
+    void kill(DFA&);
+    Vector<uint64_t> actions(const DFA&) const;
+    Vector<std::pair<uint8_t, uint32_t>> transitions(const DFA&) const;
+    uint32_t fallbackTransitionDestination(const DFA&) const;
+    void addFallbackTransition(DFA&, uint32_t destination);
+    bool containsTransition(uint8_t, DFA&);
+    void changeFallbackTransition(DFA&, uint32_t newDestination);
+    
+    bool isKilled() const { return m_flags & IsKilled; }
+    bool hasFallbackTransition() const { return m_flags & HasFallbackTransition; }
+    bool hasActions() const { return !!m_actionsLength; }
+    uint8_t transitionsLength() const { return m_transitionsLength; }
+    uint16_t actionsLength() const { return m_actionsLength; }
+    uint32_t actionsStart() const { return m_actionsStart; }
+    uint32_t transitionsStart() const { return m_transitionsStart; }
+    
+    void setActions(uint32_t start, uint16_t length)
+    {
+        ASSERT(!m_actionsStart);
+        ASSERT(!m_actionsLength);
+        m_actionsStart = start;
+        m_actionsLength = length;
+    }
+    void setTransitions(uint32_t start, uint16_t length)
+    {
+        ASSERT(!m_transitionsStart);
+        ASSERT(!m_transitionsLength);
+        m_transitionsStart = start;
+        m_transitionsLength = length;
+    }
+    void resetTransitions(uint32_t start, uint16_t length)
+    {
+        m_transitionsStart = start;
+        m_transitionsLength = length;
+    }
+    void setHasFallbackTransitionWithoutChangingDFA(bool shouldHaveFallbackTransition)
+    {
+        if (shouldHaveFallbackTransition)
+            m_flags |= HasFallbackTransition;
+        else
+            m_flags &= ~HasFallbackTransition;
+    }
+    
 #if CONTENT_EXTENSIONS_STATE_MACHINE_DEBUGGING
     Vector<unsigned> correspondingNFANodes;
 #endif
+private:
+    uint32_t m_actionsStart { 0 };
+    uint32_t m_transitionsStart { 0 };
+    uint16_t m_actionsLength { 0 };
+    uint8_t m_transitionsLength { 0 };
+    
+    uint8_t m_flags { 0 };
+    const uint8_t HasFallbackTransition = 0x01;
+    const uint8_t IsKilled = 0x02;
 };
 
+// FIXME: Pack this down to 12.
+// It's probably already 12 on ARMv7.
+COMPILE_ASSERT(sizeof(DFANode) <= 16, Keep the DFANodes small);
+
 }
 
 } // namespace WebCore

Modified: trunk/Source/WebCore/contentextensions/NFA.cpp (183203 => 183204)


--- trunk/Source/WebCore/contentextensions/NFA.cpp	2015-04-23 19:45:00 UTC (rev 183203)
+++ trunk/Source/WebCore/contentextensions/NFA.cpp	2015-04-23 19:56:57 UTC (rev 183204)
@@ -47,6 +47,18 @@
     return nextId;
 }
 
+size_t NFA::memoryUsed() const
+{
+    size_t size = 0;
+    for (const NFANode& node : m_nodes) {
+        size += sizeof(node)
+            + node.transitions.capacity() * sizeof(std::pair<uint16_t, NFANodeIndexSet>)
+            + node.transitionsOnAnyCharacter.capacity() * sizeof(unsigned)
+            + node.finalRuleIds.size() * sizeof(uint64_t);
+    }
+    return size;
+}
+
 void NFA::addTransition(unsigned from, unsigned to, char character)
 {
     ASSERT(from < m_nodes.size());

Modified: trunk/Source/WebCore/contentextensions/NFA.h (183203 => 183204)


--- trunk/Source/WebCore/contentextensions/NFA.h	2015-04-23 19:45:00 UTC (rev 183203)
+++ trunk/Source/WebCore/contentextensions/NFA.h	2015-04-23 19:56:57 UTC (rev 183204)
@@ -63,6 +63,7 @@
 #else
     void addRuleId(unsigned, uint64_t) { }
 #endif
+    size_t memoryUsed() const;
 
 private:
     friend class NFAToDFA;

Modified: trunk/Source/WebCore/contentextensions/NFAToDFA.cpp (183203 => 183204)


--- trunk/Source/WebCore/contentextensions/NFAToDFA.cpp	2015-04-23 19:45:00 UTC (rev 183203)
+++ trunk/Source/WebCore/contentextensions/NFAToDFA.cpp	2015-04-23 19:56:57 UTC (rev 183204)
@@ -214,8 +214,8 @@
 typedef HashSet<std::unique_ptr<UniqueNodeIdSet>, UniqueNodeIdSetHash, UniqueNodeIdSetHashHashTraits> UniqueNodeIdSetTable;
 
 struct NodeIdSetToUniqueNodeIdSetSource {
-    NodeIdSetToUniqueNodeIdSetSource(Vector<DFANode>& dfaGraph, const Vector<NFANode>& nfaGraph, const NodeIdSet& nodeIdSet)
-        : dfaGraph(dfaGraph)
+    NodeIdSetToUniqueNodeIdSetSource(DFA& dfa, const Vector<NFANode>& nfaGraph, const NodeIdSet& nodeIdSet)
+        : dfa(dfa)
         , nfaGraph(nfaGraph)
         , nodeIdSet(nodeIdSet)
     {
@@ -225,7 +225,7 @@
             hash += nodeId;
         this->hash = DefaultHash<unsigned>::Hash::hash(hash);
     }
-    Vector<DFANode>& dfaGraph;
+    DFA& dfa;
     const Vector<NFANode>& nfaGraph;
     const NodeIdSet& nodeIdSet;
     unsigned hash;
@@ -254,10 +254,17 @@
             newDFANode.correspondingNFANodes.append(nfaNodeId);
 #endif
         }
-        copyToVector(actions, newDFANode.actions);
+        
+        unsigned actionsStart = source.dfa.actions.size();
+        for (uint64_t action : actions)
+            source.dfa.actions.append(action);
+        unsigned actionsEnd = source.dfa.actions.size();
+        unsigned actionsLength = actionsEnd - actionsStart;
+        RELEASE_ASSERT_WITH_MESSAGE(actionsLength <= std::numeric_limits<uint16_t>::max(), "Too many actions for the current DFANode size.");
+        newDFANode.setActions(actionsStart, static_cast<uint16_t>(actionsLength));
 
-        unsigned dfaNodeId = source.dfaGraph.size();
-        source.dfaGraph.append(newDFANode);
+        unsigned dfaNodeId = source.dfa.nodes.size();
+        source.dfa.nodes.append(newDFANode);
         new (NotNull, &location) UniqueNodeIdSet(source.nodeIdSet, hash, dfaNodeId);
 
         ASSERT(location.impl());
@@ -328,9 +335,9 @@
     }
 }
 
-static ALWAYS_INLINE unsigned getOrCreateDFANode(const NodeIdSet& nfaNodeSet, const Vector<NFANode>& nfaGraph, Vector<DFANode>& dfaGraph, UniqueNodeIdSetTable& uniqueNodeIdSetTable, Vector<UniqueNodeIdSetImpl*>& unprocessedNodes)
+static ALWAYS_INLINE unsigned getOrCreateDFANode(const NodeIdSet& nfaNodeSet, const Vector<NFANode>& nfaGraph, DFA& dfa, UniqueNodeIdSetTable& uniqueNodeIdSetTable, Vector<UniqueNodeIdSetImpl*>& unprocessedNodes)
 {
-    NodeIdSetToUniqueNodeIdSetSource nodeIdSetToUniqueNodeIdSetSource(dfaGraph, nfaGraph, nfaNodeSet);
+    NodeIdSetToUniqueNodeIdSetSource nodeIdSetToUniqueNodeIdSetSource(dfa, nfaGraph, nfaNodeSet);
     auto uniqueNodeIdAddResult = uniqueNodeIdSetTable.add<NodeIdSetToUniqueNodeIdSetTranslator>(nodeIdSetToUniqueNodeIdSetSource);
     if (uniqueNodeIdAddResult.isNewEntry)
         unprocessedNodes.append(uniqueNodeIdAddResult.iterator->impl());
@@ -340,6 +347,7 @@
 
 DFA NFAToDFA::convert(NFA& nfa)
 {
+    DFA dfa;
     Vector<NFANode>& nfaGraph = nfa.m_nodes;
 
     Vector<Vector<unsigned>> nfaNodeClosures;
@@ -350,8 +358,7 @@
 
     UniqueNodeIdSetTable uniqueNodeIdSetTable;
 
-    Vector<DFANode> dfaGraph;
-    NodeIdSetToUniqueNodeIdSetSource initialNodeIdSetToUniqueNodeIdSetSource(dfaGraph, nfaGraph, initialSet);
+    NodeIdSetToUniqueNodeIdSetSource initialNodeIdSetToUniqueNodeIdSetSource(dfa, nfaGraph, initialSet);
     auto addResult = uniqueNodeIdSetTable.add<NodeIdSetToUniqueNodeIdSetTranslator>(initialNodeIdSetToUniqueNodeIdSetSource);
 
     Vector<UniqueNodeIdSetImpl*> unprocessedNodes;
@@ -366,28 +373,32 @@
         NodeIdSet setFallbackTransition;
         populateTransitions(transitionsFromClosedSet, setFallbackTransition, *uniqueNodeIdSetImpl, nfaGraph, nfaNodeClosures);
 
+        unsigned transitionsStart = dfa.transitions.size();
         for (unsigned key = 0; key < transitionsFromClosedSet.size(); ++key) {
             NodeIdSet& targetNodeSet = transitionsFromClosedSet[key];
 
             if (targetNodeSet.isEmpty())
                 continue;
 
-            unsigned targetNodeId = getOrCreateDFANode(targetNodeSet, nfaGraph, dfaGraph, uniqueNodeIdSetTable, unprocessedNodes);
-            const auto addResult = dfaGraph[dfaNodeId].transitions.add(key, targetNodeId);
-            ASSERT_UNUSED(addResult, addResult.isNewEntry);
+            unsigned targetNodeId = getOrCreateDFANode(targetNodeSet, nfaGraph, dfa, uniqueNodeIdSetTable, unprocessedNodes);
+            RELEASE_ASSERT(key <= 127);
+            dfa.transitions.append(std::make_pair(static_cast<uint8_t>(key), targetNodeId));
 
             targetNodeSet.clear();
         }
+        unsigned transitionsEnd = dfa.transitions.size();
+        unsigned transitionsLength = transitionsEnd - transitionsStart;
+        RELEASE_ASSERT(transitionsLength <= 127);
+        dfa.nodes[dfaNodeId].setTransitions(transitionsStart, static_cast<uint8_t>(transitionsLength));
+        
         if (!setFallbackTransition.isEmpty()) {
-            unsigned targetNodeId = getOrCreateDFANode(setFallbackTransition, nfaGraph, dfaGraph, uniqueNodeIdSetTable, unprocessedNodes);
-            DFANode& dfaSourceNode = dfaGraph[dfaNodeId];
-            ASSERT(!dfaSourceNode.hasFallbackTransition);
-            dfaSourceNode.hasFallbackTransition = true;
-            dfaSourceNode.fallbackTransition = targetNodeId;
+            unsigned targetNodeId = getOrCreateDFANode(setFallbackTransition, nfaGraph, dfa, uniqueNodeIdSetTable, unprocessedNodes);
+            DFANode& dfaSourceNode = dfa.nodes[dfaNodeId];
+            dfaSourceNode.addFallbackTransition(dfa, targetNodeId);
         }
     } while (!unprocessedNodes.isEmpty());
 
-    return DFA(WTF::move(dfaGraph), 0);
+    return dfa;
 }
 
 } // namespace ContentExtensions

Modified: trunk/Tools/ChangeLog (183203 => 183204)


--- trunk/Tools/ChangeLog	2015-04-23 19:45:00 UTC (rev 183203)
+++ trunk/Tools/ChangeLog	2015-04-23 19:56:57 UTC (rev 183204)
@@ -1,3 +1,15 @@
+2015-04-23  Alex Christensen  <[email protected]>
+
+        Use less memory when compiling content extensions.
+        https://bugs.webkit.org/show_bug.cgi?id=144051
+
+        Reviewed by Darin Adler and Benjamin Poulain.
+
+        * TestWebKitAPI/Tests/WebCore/ContentExtensions.cpp:
+        (TestWebKitAPI::TEST_F):
+        * TestWebKitAPI/Tests/WebCore/DFAMinimizer.cpp:
+        (TestWebKitAPI::countLiveNodes):
+
 2015-04-22  Michael Catanzaro  <[email protected]>
 
         [CMake] Clean up JSC JIT options

Modified: trunk/Tools/TestWebKitAPI/Tests/WebCore/ContentExtensions.cpp (183203 => 183204)


--- trunk/Tools/TestWebKitAPI/Tests/WebCore/ContentExtensions.cpp	2015-04-23 19:45:00 UTC (rev 183203)
+++ trunk/Tools/TestWebKitAPI/Tests/WebCore/ContentExtensions.cpp	2015-04-23 19:56:57 UTC (rev 183204)
@@ -487,6 +487,7 @@
 TEST_F(ContentExtensionTest, MultiDFA)
 {
     // Make an NFA with about 1400 nodes.
+    // FIXME: This does not make multiple DFAs anymore. Add a test that does.
     StringBuilder ruleList;
     ruleList.append('[');
     for (char c1 = 'A'; c1 <= 'Z'; ++c1) {

Modified: trunk/Tools/TestWebKitAPI/Tests/WebCore/DFAMinimizer.cpp (183203 => 183204)


--- trunk/Tools/TestWebKitAPI/Tests/WebCore/DFAMinimizer.cpp	2015-04-23 19:45:00 UTC (rev 183203)
+++ trunk/Tools/TestWebKitAPI/Tests/WebCore/DFAMinimizer.cpp	2015-04-23 19:56:57 UTC (rev 183204)
@@ -46,8 +46,8 @@
 unsigned countLiveNodes(const ContentExtensions::DFA& dfa)
 {
     unsigned counter = 0;
-    for (unsigned i = 0; i < dfa.size(); ++i) {
-        if (!dfa.nodeAt(i).isKilled)
+    for (const auto& node : dfa.nodes) {
+        if (!node.isKilled())
             ++counter;
     }
     return counter;
_______________________________________________
webkit-changes mailing list
[email protected]
https://lists.webkit.org/mailman/listinfo/webkit-changes

Reply via email to