Branch: refs/heads/main
  Home:   https://github.com/WebKit/WebKit
  Commit: c8b66aa0832bb531bdc09889d1c483a9c02d15fd
      
https://github.com/WebKit/WebKit/commit/c8b66aa0832bb531bdc09889d1c483a9c02d15fd
  Author: Yusuke Suzuki <[email protected]>
  Date:   2026-01-31 (Sat, 31 Jan 2026)

  Changed paths:
    M JSTests/stress/regexp-fixed-count-parentheses-unroll.js
    A JSTests/stress/regexp-fixedcount-backtrack-alternatives.js
    A JSTests/stress/regexp-yarr-jit-fixed-count-capture.js
    M Source/JavaScriptCore/yarr/Yarr.h
    M Source/JavaScriptCore/yarr/YarrJIT.cpp
    M Source/JavaScriptCore/yarr/YarrPattern.h

  Log Message:
  -----------
  [JSC] YarrJIT should support fixed-count subpatterns with capture
https://bugs.webkit.org/show_bug.cgi?id=306582
rdar://169224910

Reviewed by Sosuke Suzuki.

This patch adds Yarr JIT support for FixedCount subpatterns.
For each iteration of subpattern, we need to capture the current state
since the next iteration will override the capture information.
Consider /(a+){2}b/ matching "aaab":

    Iteration 1: (a+) greedily matches "aaa" → capture group 1 = "aaa"
    Iteration 2: (a+) needs at least one 'a', but only 'b' remains → FAIL
    Backtrack to iteration 1: try (a+) = "aa" instead
    Iteration 2: (a+) = "a" → capture group 1 = "a"
    Match 'b' → SUCCESS
    Result: ["aaab", "a"]

When iteration 2 fails, we need to:
1. Restore iteration 1's starting state (position, captures before iteration 1 
modified them)
2. Re-try iteration 1 with different choices

We store these information to ParenContext to restore when it fails.
The implementation works in this way, let's have /(x){3}/ matching "xxx":

    Initial: head = null, matchAmount = 0, index = 0

    ━━━ Iteration 1 starts ━━━
        At m_reentry: index=0, matchAmount=0, captures=undefined
        Allocate context1, save: {begin=0, matchAmount=0, captures=undefined}
        Push: head → context1
        Match 'x', capture = "x"
        END: matchAmount = 1, loop back

    ━━━ Iteration 2 starts ━━━
        At m_reentry: index=1, matchAmount=1, captures="x"
        Allocate context2, save: {begin=1, matchAmount=1, captures="x"}
        Push: head → context2 → context1
        Match 'x', capture = "x"
        END: matchAmount = 2, loop back

    ━━━ Iteration 3 starts ━━━
        At m_reentry: index=2, matchAmount=2, captures="x"
        Allocate context3, save: {begin=2, matchAmount=2, captures="x"}
        Push: head → context3 → context2 → context1
        Match 'x', capture = "x"
        END: matchAmount = 3, done

So after this subpatterns part, frame is keeping context3.
When backtrack happens after this, we restore the already modified state
with context3 and popping it. This is the state before the 3rd iteration
starts. And we jump back to the END's backtrack code. And we continue
doing the same thing, and entire match fails.

Right now, we are not able to JIT for /(a+){3}/ because of the above.
When we backtrack, we do 2nd iteration's backtrack instead of 3rd's a+
part. So we cannot support this thing with the current form (saving and
restoring state at a boundary of (). This should be extended later.

* JSTests/stress/regexp-fixed-count-parentheses-unroll.js:
(testOneToN):
* JSTests/stress/regexp-fixedcount-backtrack-alternatives.js: Added.
(shouldBe):
(re):
* JSTests/stress/regexp-yarr-jit-fixed-count-capture.js: Added.
(shouldBe):
(re.From.s.S.s):
(re):
* Source/JavaScriptCore/yarr/Yarr.h:
* Source/JavaScriptCore/yarr/YarrJIT.cpp:
* Source/JavaScriptCore/yarr/YarrPattern.h:
(JSC::Yarr::BackTrackInfoParenthesesFixedCount::beginIndex): Deleted.
(JSC::Yarr::BackTrackInfoParenthesesFixedCount::matchAmountIndex): Deleted.

Canonical link: https://commits.webkit.org/306582@main



To unsubscribe from these emails, change your notification settings at 
https://github.com/WebKit/WebKit/settings/notifications

Reply via email to