Branch: refs/heads/main
  Home:   https://github.com/WebKit/WebKit
  Commit: d4f884d21c0e96b8ec7d1eb289343cf221cbd016
      
https://github.com/WebKit/WebKit/commit/d4f884d21c0e96b8ec7d1eb289343cf221cbd016
  Author: Yusuke Suzuki <[email protected]>
  Date:   2026-02-11 (Wed, 11 Feb 2026)

  Changed paths:
    A JSTests/stress/regexp-fixed-count-multi-alt-backtracking.js
    A JSTests/stress/regexp-nested-quantified-parens-backtrack.js
    M Source/JavaScriptCore/runtime/OptionsList.h
    M Source/JavaScriptCore/yarr/YarrJIT.cpp

  Log Message:
  -----------
  [YARR] Need to use stored backtracking address from NestedAlternativeEnd
rdar://170038644
https://bugs.webkit.org/show_bug.cgi?id=307532

Reviewed by Yijia Huang.

Let's have Pattern: (a|b|c(){2}$){2}$ with input 'aabbbccc'

When the first iteration of `(a|b|c(){2}$)` can run successfully with
'a'. And the second iteration works well too with 'a'. But after that, $
fails because there is 'bbbccc' after 'aa'.

So, we backtrack to the second iteration. Since 'a' is just a character
and no way to backtrack again, we need to go to the next alternative
'b'. And this fails too, and next alternative 'c' fails too.

At this point, we go back to the previous first iteration. And we go to
NestedAlternativeEnd with the previous iteration's context (via
restoring of paren context).

To make it easily debuggable, we added trace logging via
Options::traceRegExpJITExecution. This dumps the execution log of the
above like this. Previously, we are wrongly jumping to 
ParenthesesSubpatternBegin.bt
from NestedAlternativeEnd.bt because we were not using stored
returnAddress jump. But this is fixed in this patch.

      === TRY MATCH AT POSITION 0 ===

      RegExpJIT [0] BodyAlternativeBegin index=0          # Start matching at 
pos 0
      RegExpJIT [1] ParenthesesSubpatternBegin index=0    # Outer paren iter1: 
try at pos 0
      RegExpJIT [1] ParenthesesSubpatternBegin index=1    # Outer paren iter2: 
try at pos 1 ('a' matched iter1)
      RegExpJIT [14] ParenthesesSubpatternEnd index=2     # iter2 completed, 
now at pos 2 ('a' matched iter2)
      RegExpJIT [14] ParenthesesSubpatternEnd.bt index=2  # '$' at pos 2 failed 
('b' != end), backtrack
      RegExpJIT [13] NestedAlternativeEnd.bt index=2      # Jump via 
returnAddress to content backtrack
      RegExpJIT [2] NestedAlternativeBegin.bt index=2     # 'a' content 
backtrack fails (nothing to give up)
      RegExpJIT [4] NestedAlternativeNext index=1         # Try 'b' at pos 1 
(with my fix: beginIndex restored)
      RegExpJIT [4] NestedAlternativeNext.bt index=2      # 'b' at pos 1 fails 
('a' != 'b'), backtrack
      RegExpJIT [6] NestedAlternativeNext index=1         # Try 'c(){2}$' at 
pos 1
      RegExpJIT [6] NestedAlternativeNext.bt index=2      # 'c' at pos 1 fails 
('a' != 'c'), backtrack
      RegExpJIT [1] ParenthesesSubpatternBegin.bt index=1 # All iter2 alts 
exhausted, try iter1

      # Now backtracking into iter1 (which matched 'a' at pos 0)
      RegExpJIT [2] NestedAlternativeBegin.bt index=2     # 'a' content 
backtrack at iter1's end (pos 1) - fails
      RegExpJIT [4] NestedAlternativeNext index=1         # Try 'b' at iter1's 
begin (pos 0)
      RegExpJIT [4] NestedAlternativeNext.bt index=2      # 'b' at pos 0 fails 
('a' != 'b')
      RegExpJIT [6] NestedAlternativeNext index=1         # Try 'c(){2}$' at 
pos 0
      RegExpJIT [6] NestedAlternativeNext.bt index=2      # 'c' at pos 0 fails 
('a' != 'c')
      RegExpJIT [1] ParenthesesSubpatternBegin.bt index=1 # All iter1 alts 
exhausted, no more contexts
      RegExpJIT [0] BodyAlternativeBegin.bt index=0       # Propagate failure, 
try next start position

      === TRY MATCH AT POSITION 1 ===

      RegExpJIT [0] BodyAlternativeBegin index=1          # Start matching at 
pos 1
      RegExpJIT [1] ParenthesesSubpatternBegin index=1    # Outer paren iter1: 
try at pos 1
      RegExpJIT [1] ParenthesesSubpatternBegin index=2    # iter2: try at pos 2 
('a' matched at 1)
      RegExpJIT [2] NestedAlternativeBegin.bt index=3     # 'a' at pos 2 fails 
('b' != 'a')
      RegExpJIT [4] NestedAlternativeNext index=2         # Try 'b' at pos 2
      RegExpJIT [14] ParenthesesSubpatternEnd index=3     # iter2 completed 
with 'b', now at pos 3
      RegExpJIT [14] ParenthesesSubpatternEnd.bt index=3  # '$' at pos 3 fails, 
backtrack
      RegExpJIT [13] NestedAlternativeEnd.bt index=3      # Jump to content 
backtrack
      RegExpJIT [4] NestedAlternativeNext.bt index=3      # 'b' content 
backtrack fails
      RegExpJIT [6] NestedAlternativeNext index=2         # Try 'c(){2}$' at 
pos 2
      RegExpJIT [6] NestedAlternativeNext.bt index=3      # 'c' at pos 2 fails 
('b' != 'c')
      RegExpJIT [1] ParenthesesSubpatternBegin.bt index=2 # All iter2 alts 
exhausted, try iter1

      # Backtrack iter1
      RegExpJIT [4] NestedAlternativeNext.bt index=3      # iter1 took 'a', 
content backtrack fails
      RegExpJIT [6] NestedAlternativeNext index=2         # Try 'c(){2}$' at 
iter1's begin (pos 1)
      RegExpJIT [6] NestedAlternativeNext.bt index=3      # 'c' at pos 1 fails 
('a' != 'c')
      RegExpJIT [1] ParenthesesSubpatternBegin.bt index=2 # Incomplete context 
skipped

      # Now at iter1's context from when it matched 'a' at pos 1
      RegExpJIT [2] NestedAlternativeBegin.bt index=2     # 'a' content 
backtrack at pos 2 fails
      RegExpJIT [4] NestedAlternativeNext index=1         # Try 'b' at pos 1
      RegExpJIT [4] NestedAlternativeNext.bt index=2      # 'b' at pos 1 fails 
('a' != 'b')
      RegExpJIT [6] NestedAlternativeNext index=1         # Try 'c' at pos 1
      RegExpJIT [6] NestedAlternativeNext.bt index=2      # 'c' at pos 1 fails
      RegExpJIT [1] ParenthesesSubpatternBegin.bt index=1 # All exhausted
      RegExpJIT [0] BodyAlternativeBegin.bt index=1       # Try next position

      === TRY MATCH AT POSITION 2 ===

      RegExpJIT [0] BodyAlternativeBegin index=2          # Start at pos 2 ('b')
      RegExpJIT [1] ParenthesesSubpatternBegin index=2    # iter1 at pos 2
      RegExpJIT [2] NestedAlternativeBegin.bt index=3     # 'a' fails at pos 2
      RegExpJIT [4] NestedAlternativeNext index=2         # Try 'b' at pos 2
      RegExpJIT [1] ParenthesesSubpatternBegin index=3    # iter1='b', iter2 at 
pos 3
      RegExpJIT [2] NestedAlternativeBegin.bt index=4     # 'a' fails at pos 3
      RegExpJIT [4] NestedAlternativeNext index=3         # Try 'b' at pos 3
      RegExpJIT [14] ParenthesesSubpatternEnd index=4     # iter2='b', 
completed at pos 4
      RegExpJIT [14] ParenthesesSubpatternEnd.bt index=4  # '$' fails at pos 4
      RegExpJIT [13] NestedAlternativeEnd.bt index=4
      RegExpJIT [4] NestedAlternativeNext.bt index=4
      RegExpJIT [6] NestedAlternativeNext index=3         # Try 'c' at pos 3
      RegExpJIT [6] NestedAlternativeNext.bt index=4      # 'c' fails ('b' != 
'c')
      RegExpJIT [1] ParenthesesSubpatternBegin.bt index=3 # iter2 exhausted
      RegExpJIT [4] NestedAlternativeNext.bt index=4      # iter1 backtrack
      RegExpJIT [6] NestedAlternativeNext index=3         # Try 'c' at iter1 
pos 2
      RegExpJIT [6] NestedAlternativeNext.bt index=4      # 'c' fails
      RegExpJIT [1] ParenthesesSubpatternBegin.bt index=3 # Skip incomplete
      RegExpJIT [4] NestedAlternativeNext.bt index=3      # iter1 'b' content 
backtrack
      RegExpJIT [6] NestedAlternativeNext index=2         # Try 'c' at pos 2
      RegExpJIT [6] NestedAlternativeNext.bt index=3      # 'c' fails
      RegExpJIT [1] ParenthesesSubpatternBegin.bt index=2 # All exhausted
      RegExpJIT [0] BodyAlternativeBegin.bt index=2

      === TRY MATCH AT POSITION 3 ===

      RegExpJIT [0] BodyAlternativeBegin index=3          # Start at pos 3 ('b')
      # ... similar pattern, eventually fails ...
      RegExpJIT [0] BodyAlternativeBegin.bt index=3

      === TRY MATCH AT POSITION 4 ===

      RegExpJIT [0] BodyAlternativeBegin index=4          # Start at pos 4 ('b')
      RegExpJIT [1] ParenthesesSubpatternBegin index=4
      RegExpJIT [2] NestedAlternativeBegin.bt index=5     # 'a' fails
      RegExpJIT [4] NestedAlternativeNext index=4         # Try 'b'
      RegExpJIT [1] ParenthesesSubpatternBegin index=5    # iter1='b', iter2 at 
pos 5
      RegExpJIT [2] NestedAlternativeBegin.bt index=6     # 'a' fails at pos 5
      RegExpJIT [4] NestedAlternativeNext index=5         # Try 'b' at pos 5
      RegExpJIT [4] NestedAlternativeNext.bt index=6      # 'b' fails ('c' != 
'b')
      RegExpJIT [6] NestedAlternativeNext index=5         # Try 'c(){2}$' at 
pos 5
      RegExpJIT [8] ParenthesesSubpatternBegin index=6    # Enter inner (){2} 
at pos 6...

Test: JSTests/stress/regexp-nested-quantified-parens-backtrack.js

* JSTests/stress/regexp-fixed-count-multi-alt-backtracking.js: Added.
(shouldBe):
(shouldBeArray):
(testDiffLen1):
(testDiffLen2):
(testDiffLen3):
(testDiffLen4):
(testDiffLen5):
(testThreeAlt1):
(testFourAlt1):
(testThreeAlt2):
(testThreeAlt3):
(testThreeIter1):
(testThreeIter2):
(testFourIter1):
(testFiveIter1):
(testInterIter1):
(testInterIter2):
(testPartial1):
(testPartial2):
(testGreedy1):
(testGreedy2):
(testGreedy3):
(testStar1):
(testNested1):
(testNested2):
(testSingleIter):
(testExactLong):
(testPosition):
* JSTests/stress/regexp-nested-quantified-parens-backtrack.js: Added.
(test):
* Source/JavaScriptCore/runtime/OptionsList.h:
* Source/JavaScriptCore/yarr/YarrJIT.cpp:

Canonical link: https://commits.webkit.org/307293@main



To unsubscribe from these emails, change your notification settings at 
https://github.com/WebKit/WebKit/settings/notifications

Reply via email to