Branch: refs/heads/main
Home: https://github.com/WebKit/WebKit
Commit: 34b0b047bb64f93ccd1b003d410e0f8b4c9d681b
https://github.com/WebKit/WebKit/commit/34b0b047bb64f93ccd1b003d410e0f8b4c9d681b
Author: David Degazio <[email protected]>
Date: 2024-06-27 (Thu, 27 Jun 2024)
Changed paths:
A JSTests/microbenchmarks/regexp-match-alphanumeric.js
A JSTests/microbenchmarks/regexp-match-multiple-single-chars.js
A JSTests/microbenchmarks/regexp-match-separators.js
M Source/JavaScriptCore/assembler/MacroAssembler.h
M Source/JavaScriptCore/assembler/MacroAssemblerARM64.h
M Source/JavaScriptCore/assembler/MacroAssemblerARMv7.h
M Source/JavaScriptCore/assembler/MacroAssemblerRISCV64.h
M Source/JavaScriptCore/assembler/MacroAssemblerX86Common.h
M Source/JavaScriptCore/assembler/MacroAssemblerX86_64.h
M Source/JavaScriptCore/yarr/YarrJIT.cpp
Log Message:
-----------
[JSC] Use immediate bit-vectors for character class matching in YarrJIT
https://bugs.webkit.org/show_bug.cgi?id=275279
rdar://129419939
Reviewed by Michael Saboff.
Changes how YarrJIT handles character class matches via the following:
1. Optimize single-range checks from two branches into subtract + branch.
2. Use a bit-vector test to quickly match a set of individual characters,
   as opposed to the current strategy of O(n) sequential equality checks.
3. Make the logic of matchCharacterClassRange more recursive. We use the
   optimized single-range test if there is only a single range, and use
   the new bit-vector test if the whole set of ranges and character matches
   fits within a small-enough range. Moreover, the binary search is now
   totally recursive, meaning we can use these specialized checks for
   recursive checks within the binary search too, whereas currently binary
   search is kind of all-or-nothing.
4. A few small optimizations are removed - YarrJIT no longer special-cases
   ASCII letters in character class matches, since character set matching
   is now faster. Turning adjacent character matches into length-two ranges
   is also removed during CharacterClass construction since this doesn't
   really do anything other than make the binary search do extra work (I'd
   be really surprised if this was ever particularly profitable).
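
The three checks described in items 1-3 can be sketched in plain C++ as
follows. This is only an illustration of the general techniques, not the
code YarrJIT emits: the names inRange, inMask, Range and matchRanges are
hypothetical, the 64-bit mask width is an assumption (the log only says the
set must fit within a "small-enough range"), and the mask is built at run
time here whereas a JIT would bake it in as an immediate.

```cpp
#include <cstdint>

// Hypothetical helper: true if lo <= c <= hi, using one subtract and one
// unsigned compare instead of two separate branches.
inline bool inRange(uint32_t c, uint32_t lo, uint32_t hi)
{
    // If c < lo, the unsigned subtraction wraps to a huge value,
    // so the single comparison also rejects characters below lo.
    return (c - lo) <= (hi - lo);
}

// Hypothetical helper: tests bit (c - base) of a 64-bit mask.
// Assumes the mask was built with bit (ch - base) set for each member ch.
inline bool inMask(uint32_t c, uint32_t base, uint64_t mask)
{
    uint32_t offset = c - base; // wraps if c < base, failing the bound check
    return offset < 64 && ((mask >> offset) & 1);
}

// Inclusive range; a single character is stored as lo == hi.
struct Range {
    uint32_t lo;
    uint32_t hi;
};

// Recursive matcher over sorted, non-overlapping ranges [first, last).
// Each level picks the cheapest applicable test before falling back to
// splitting the ranges in half, so sub-problems get the fast paths too.
inline bool matchRanges(uint32_t c, const Range* first, const Range* last)
{
    if (first == last)
        return false;

    // Exactly one range: subtract + compare.
    if (last - first == 1)
        return inRange(c, first->lo, first->hi);

    // Everything fits in a 64-wide window: one bit-vector test.
    uint32_t base = first->lo;
    if ((last - 1)->hi - base < 64) {
        uint64_t mask = 0;
        for (const Range* r = first; r != last; ++r) {
            for (uint32_t ch = r->lo; ch <= r->hi; ++ch)
                mask |= uint64_t(1) << (ch - base);
        }
        return inMask(c, base, mask);
    }

    // Otherwise binary-search: recurse into the half that could contain c.
    const Range* mid = first + (last - first) / 2;
    return c < mid->lo ? matchRanges(c, first, mid) : matchRanges(c, mid, last);
}
```

With the ranges for [0-9A-Z_a-z], the whole set spans more than 64 code
points, so this sketch splits once and then answers each half with a single
mask test rather than a chain of comparisons.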
Overall, this seems to be a modest but appreciable perf win on
microbenchmarks. On the added ASCII alphanumeric test I'm seeing about 10%
improvement with this new approach, and on the single-chars test I'm seeing
more like 20% improvement. I've also added a test for a set of separator
chars, where the improvement on my machine is only about 2%. That is pretty
small and hopefully improvable, but let's have the microbenchmark in the
tree anyway.
* JSTests/microbenchmarks/regexp-match-alphanumeric.js: Added.
* JSTests/microbenchmarks/regexp-match-multiple-single-chars.js: Added.
* JSTests/microbenchmarks/regexp-match-separators.js: Added.
(let.src):
(dot):
(test):
(i.let.re):
* Source/JavaScriptCore/yarr/YarrJIT.cpp:
* Source/JavaScriptCore/yarr/YarrPattern.cpp:
(JSC::Yarr::CharacterClassConstructor::addSorted):
Canonical link: https://commits.webkit.org/280425@main