Branch: refs/heads/main
Home: https://github.com/WebKit/WebKit
Commit: bd24d5579c471255533ccd95390dd2176539a206
https://github.com/WebKit/WebKit/commit/bd24d5579c471255533ccd95390dd2176539a206
Author: Yusuke Suzuki <[email protected]>
Date: 2026-06-01 (Mon, 01 Jun 2026)
Changed paths:
A JSTests/stress/regexp-character-class-latin1-boundary.js
M Source/JavaScriptCore/yarr/YarrInterpreter.cpp
M Source/JavaScriptCore/yarr/YarrJIT.cpp
M Source/JavaScriptCore/yarr/YarrPattern.cpp
M Source/JavaScriptCore/yarr/YarrPattern.h
M Source/JavaScriptCore/yarr/create_regex_tables
M Source/JavaScriptCore/yarr/generateYarrUnicodePropertyTables.py
Log Message:
-----------
[Yarr] Change Yarr m_matches / m_ranges / m_matchesUnicode / m_rangesUnicode ranges
https://bugs.webkit.org/show_bug.cgi?id=315926
rdar://178330634
Reviewed by Yijia Huang.
Yarr previously split the m_matches / m_ranges / m_matchesUnicode /
m_rangesUnicode ranges at the ASCII vs. non-ASCII boundary. But given that
we have Char8 / Char16, it is more efficient and natural to split them at
the Latin1 vs. non-Latin1 boundary. This patch makes that change. We also
improve addSorted / addSortedRange for edge cases: we carefully avoid
adding a singleton range (only one element) for the boundary char code.
Finally, based on the new boundary, we rename the tables to m_matches8,
m_ranges8, m_matches32, and m_ranges32.
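
A minimal sketch of the idea described above, not the actual Yarr code: a
range that straddles the Latin1 boundary (0xFF) is cut into an 8-bit piece
and a 32-bit piece, and a piece that ends up with a single element is stored
as a match rather than as a one-element range. The names SplitClass, Range,
putPiece, putRange, contains, and maxLatin1 are hypothetical stand-ins, and
the sketch assumes no sorting or merging of entries, which the real
addSorted / addSortedRange handle.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins for illustration; these are not the Yarr types.
struct Range {
    char32_t begin;
    char32_t end; // inclusive
};

struct SplitClass {
    std::vector<char32_t> matches8;  // single code points <= 0xFF
    std::vector<Range> ranges8;      // ranges entirely within [0x00, 0xFF]
    std::vector<char32_t> matches32; // single code points > 0xFF
    std::vector<Range> ranges32;     // ranges entirely above 0xFF
};

constexpr char32_t maxLatin1 = 0xFF;

// Store [lo, hi] in one table pair; a one-element piece becomes a match
// so that no singleton range is ever recorded.
void putPiece(std::vector<char32_t>& matches, std::vector<Range>& ranges, char32_t lo, char32_t hi)
{
    if (lo == hi)
        matches.push_back(lo);
    else
        ranges.push_back({ lo, hi });
}

// Add [lo, hi], cutting it at the Latin1 boundary when it straddles it.
// Assumption: entries are only appended, never sorted or merged.
void putRange(SplitClass& c, char32_t lo, char32_t hi)
{
    assert(lo <= hi);
    if (hi <= maxLatin1) {
        putPiece(c.matches8, c.ranges8, lo, hi);
        return;
    }
    if (lo > maxLatin1) {
        putPiece(c.matches32, c.ranges32, lo, hi);
        return;
    }
    putPiece(c.matches8, c.ranges8, lo, maxLatin1);
    putPiece(c.matches32, c.ranges32, maxLatin1 + 1, hi);
}

// A Latin1 character only needs the 8-bit tables; anything else only
// needs the 32-bit tables. Linear search keeps the sketch short.
bool contains(const SplitClass& c, char32_t ch)
{
    const auto& matches = ch <= maxLatin1 ? c.matches8 : c.matches32;
    const auto& ranges = ch <= maxLatin1 ? c.ranges8 : c.ranges32;
    for (char32_t m : matches) {
        if (m == ch)
            return true;
    }
    for (const Range& r : ranges) {
        if (r.begin <= ch && ch <= r.end)
            return true;
    }
    return false;
}
```

In this sketch, putRange(c, 0xF0, 0x100) leaves the range 0xF0..0xFF in
ranges8 and the single code point 0x100 in matches32, so no one-element
range is created on the non-Latin1 side of the boundary.
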
Test: JSTests/stress/regexp-character-class-latin1-boundary.js
* JSTests/stress/regexp-character-class-latin1-boundary.js: Added.
(shouldBe):
(repeat):
* Source/JavaScriptCore/yarr/YarrInterpreter.cpp:
(JSC::Yarr::Interpreter::testCharacterClass):
* Source/JavaScriptCore/yarr/YarrJIT.cpp:
(JSC::Yarr::MaskedAlternativeInfo::computeMaskForCharacterClass):
* Source/JavaScriptCore/yarr/YarrPattern.cpp:
(JSC::Yarr::CharacterClassConstructor::reset):
(JSC::Yarr::CharacterClassConstructor::append):
(JSC::Yarr::CharacterClassConstructor::appendInverted):
(JSC::Yarr::CharacterClassConstructor::putChar):
(JSC::Yarr::CharacterClassConstructor::putCharNonUnion):
(JSC::Yarr::CharacterClassConstructor::putRange):
(JSC::Yarr::CharacterClassConstructor::atomClassStringDisjunction):
(JSC::Yarr::CharacterClassConstructor::invertMatches):
(JSC::Yarr::CharacterClassConstructor::performSetOpWith):
(JSC::Yarr::CharacterClassConstructor::performSetOpWithMatches):
(JSC::Yarr::CharacterClassConstructor::charClass):
(JSC::Yarr::CharacterClassConstructor::addSorted):
(JSC::Yarr::CharacterClassConstructor::addSortedRange):
(JSC::Yarr::CharacterClassConstructor::latin1Op):
(JSC::Yarr::CharacterClassConstructor::latin1Invert):
(JSC::Yarr::CharacterClassConstructor::nonLatin1OpSorted):
(JSC::Yarr::CharacterClassConstructor::nonLatin1Invert):
(JSC::Yarr::CharacterClassConstructor::coalesceTables):
(JSC::Yarr::dumpCharacterClass):
(JSC::Yarr::anycharCreate):
(JSC::Yarr::CharacterClass::hasSharedLeadSurrogate const):
(JSC::Yarr::CharacterClassConstructor::asciiOp): Deleted.
(JSC::Yarr::CharacterClassConstructor::asciiInvert): Deleted.
(JSC::Yarr::CharacterClassConstructor::unicodeOpSorted): Deleted.
(JSC::Yarr::CharacterClassConstructor::unicodeInvert): Deleted.
(JSC::Yarr::CharacterClass::copyOnly8BitCharacterData): Deleted.
* Source/JavaScriptCore/yarr/YarrPattern.h:
(JSC::Yarr::CharacterClass::CharacterClass):
(JSC::Yarr::CharacterClass::hasSingleCharacters const):
(JSC::Yarr::ClassSet::ClassSet):
* Source/JavaScriptCore/yarr/create_regex_tables:
(in):
* Source/JavaScriptCore/yarr/generateYarrUnicodePropertyTables.py:
(PropertyData.addMatch):
(PropertyData.addRange):
(PropertyData.addMatchUnordered):
(PropertyData.addRangeUnordered):
(PropertyData.removeMatch):
Canonical link: https://commits.webkit.org/314305@main