Branch: refs/heads/main
Home: https://github.com/WebKit/WebKit
Commit: c8b66aa0832bb531bdc09889d1c483a9c02d15fd
https://github.com/WebKit/WebKit/commit/c8b66aa0832bb531bdc09889d1c483a9c02d15fd
Author: Yusuke Suzuki <[email protected]>
Date: 2026-01-31 (Sat, 31 Jan 2026)
Changed paths:
M JSTests/stress/regexp-fixed-count-parentheses-unroll.js
A JSTests/stress/regexp-fixedcount-backtrack-alternatives.js
A JSTests/stress/regexp-yarr-jit-fixed-count-capture.js
M Source/JavaScriptCore/yarr/Yarr.h
M Source/JavaScriptCore/yarr/YarrJIT.cpp
M Source/JavaScriptCore/yarr/YarrPattern.h
Log Message:
-----------
[JSC] YarrJIT should support fixed-count subpatterns with capture
https://bugs.webkit.org/show_bug.cgi?id=306582
rdar://169224910
Reviewed by Sosuke Suzuki.
This patch adds Yarr JIT support for FixedCount subpatterns.
For each iteration of subpattern, we need to capture the current state
since the next iteration will override the capture information.
Consider /(a+){2}b/ matching "aaab":
Iteration 1: (a+) greedily matches "aaa" → capture group 1 = "aaa"
Iteration 2: (a+) needs at least one 'a', but only 'b' remains → FAIL
Backtrack to iteration 1: try (a+) = "aa" instead
Iteration 2: (a+) = "a" → capture group 1 = "a"
Match 'b' → SUCCESS
Result: ["aaab", "a"]
When iteration 2 fails, we need to:
1. Restore iteration 1's starting state (position, captures before iteration 1
modified them)
2. Re-try iteration 1 with different choices
We store these information to ParenContext to restore when it fails.
The implementation works in this way, let's have /(x){3}/ matching "xxx":
Initial: head = null, matchAmount = 0, index = 0
━━━ Iteration 1 starts ━━━
At m_reentry: index=0, matchAmount=0, captures=undefined
Allocate context1, save: {begin=0, matchAmount=0, captures=undefined}
Push: head → context1
Match 'x', capture = "x"
END: matchAmount = 1, loop back
━━━ Iteration 2 starts ━━━
At m_reentry: index=1, matchAmount=1, captures="x"
Allocate context2, save: {begin=1, matchAmount=1, captures="x"}
Push: head → context2 → context1
Match 'x', capture = "x"
END: matchAmount = 2, loop back
━━━ Iteration 3 starts ━━━
At m_reentry: index=2, matchAmount=2, captures="x"
Allocate context3, save: {begin=2, matchAmount=2, captures="x"}
Push: head → context3 → context2 → context1
Match 'x', capture = "x"
END: matchAmount = 3, done
So after this subpatterns part, frame is keeping context3.
When backtrack happens after this, we restore the already modified state
with context3 and popping it. This is the state before the 3rd iteration
starts. And we jump back to the END's backtrack code. And we continue
doing the same thing, and entire match fails.
Right now, we are not able to JIT for /(a+){3}/ because of the above.
When we backtrack, we do 2nd iteration's backtrack instead of 3rd's a+
part. So we cannot support this thing with the current form (saving and
restoring state at a boundary of (). This should be extended later.
* JSTests/stress/regexp-fixed-count-parentheses-unroll.js:
(testOneToN):
* JSTests/stress/regexp-fixedcount-backtrack-alternatives.js: Added.
(shouldBe):
(re):
* JSTests/stress/regexp-yarr-jit-fixed-count-capture.js: Added.
(shouldBe):
(re.From.s.S.s):
(re):
* Source/JavaScriptCore/yarr/Yarr.h:
* Source/JavaScriptCore/yarr/YarrJIT.cpp:
* Source/JavaScriptCore/yarr/YarrPattern.h:
(JSC::Yarr::BackTrackInfoParenthesesFixedCount::beginIndex): Deleted.
(JSC::Yarr::BackTrackInfoParenthesesFixedCount::matchAmountIndex): Deleted.
Canonical link: https://commits.webkit.org/306582@main
To unsubscribe from these emails, change your notification settings at
https://github.com/WebKit/WebKit/settings/notifications