Branch: refs/heads/main
Home: https://github.com/WebKit/WebKit
Commit: 270c824459cec4d19dab347a8db1526e0be50737
https://github.com/WebKit/WebKit/commit/270c824459cec4d19dab347a8db1526e0be50737
Author: Michael Saboff <[email protected]>
Date: 2023-03-03 (Fri, 03 Mar 2023)
Changed paths:
M JSTests/es6/Proxy_internal_get_calls_RegExp.prototype.flags.js
A JSTests/stress/regexp-vflag-property-of-strings.js
M JSTests/stress/static-getter-in-names.js
M JSTests/test262/config.yaml
M LayoutTests/js/Object-getOwnPropertyNames-expected.txt
M LayoutTests/js/script-tests/Object-getOwnPropertyNames.js
M Source/JavaScriptCore/JavaScriptCore.xcodeproj/project.pbxproj
M Source/JavaScriptCore/builtins/BuiltinNames.h
M Source/JavaScriptCore/builtins/RegExpPrototype.js
M Source/JavaScriptCore/builtins/StringPrototype.js
M Source/JavaScriptCore/bytecode/LinkTimeConstant.h
M Source/JavaScriptCore/dfg/DFGAbstractInterpreterInlines.h
M Source/JavaScriptCore/dfg/DFGFixupPhase.cpp
M Source/JavaScriptCore/dfg/DFGOperations.cpp
M Source/JavaScriptCore/dfg/DFGStrengthReductionPhase.cpp
M Source/JavaScriptCore/runtime/CachedTypes.cpp
M Source/JavaScriptCore/runtime/CommonIdentifiers.h
M Source/JavaScriptCore/runtime/JSGlobalObject.cpp
M Source/JavaScriptCore/runtime/JSGlobalObject.h
M Source/JavaScriptCore/runtime/JSGlobalObjectInlines.h
M Source/JavaScriptCore/runtime/RegExp.h
M Source/JavaScriptCore/runtime/RegExpCache.h
M Source/JavaScriptCore/runtime/RegExpObject.cpp
M Source/JavaScriptCore/runtime/RegExpPrototype.cpp
A Source/JavaScriptCore/ucd/emoji-sequences.txt
A Source/JavaScriptCore/ucd/emoji-zwj-sequences.txt
M Source/JavaScriptCore/yarr/Yarr.h
M Source/JavaScriptCore/yarr/YarrErrorCode.cpp
M Source/JavaScriptCore/yarr/YarrErrorCode.h
M Source/JavaScriptCore/yarr/YarrFlags.cpp
M Source/JavaScriptCore/yarr/YarrFlags.h
M Source/JavaScriptCore/yarr/YarrInterpreter.cpp
M Source/JavaScriptCore/yarr/YarrInterpreter.h
M Source/JavaScriptCore/yarr/YarrJIT.cpp
M Source/JavaScriptCore/yarr/YarrParser.h
M Source/JavaScriptCore/yarr/YarrPattern.cpp
M Source/JavaScriptCore/yarr/YarrPattern.h
M Source/JavaScriptCore/yarr/YarrSyntaxChecker.cpp
M Source/JavaScriptCore/yarr/YarrUnicodeProperties.cpp
M Source/JavaScriptCore/yarr/YarrUnicodeProperties.h
M Source/JavaScriptCore/yarr/generateYarrUnicodePropertyTables.py
M Source/WebCore/contentextensions/URLFilterParser.cpp
Log Message:
-----------
Implement RegExp `v` flag with set notation + properties of strings
https://bugs.webkit.org/show_bug.cgi?id=241593
rdar://100337109
Reviewed by Yusuke Suzuki.
This change implements the TC39 stage 3 proposal RegExp v flag with set
notation + properties of strings,
https://github.com/tc39/proposal-regexp-v-flag. It adds a new "unicodeSets"
compile mode for the Yarr engine.
Like the prior Unicode Yarr features, this change is driven by Unicode
Character Database (UCD) files.
This change includes two such new files,
JavaScriptCore/ucd/{emoji-sequences.txt & emoji-zwj-sequences.txt}.
The newly added properties include lists of strings. These strings are
handled through the character class syntax. When it comes to matching, however,
some desugaring turns such a property of strings into a list of alternatives.
For example, if a property has strings str1...strN plus a traditional
character class, single-character-class, we create the pattern equivalent of:
    (?:str1|str2|...|strN|[single-character-class])
Per the spec, longer strings appear earlier in the alternation, and all strings
appear before the traditional character class.
This lets a longer string match even when the list also contains a shorter
string that is a prefix of it.
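The ordering rule above can be sketched in Python. This is only an illustration
of the desugaring idea, not Yarr's code: the helper name
desugar_class_with_strings is hypothetical, length is counted in Python code
points, and Python's re module stands in for the matching engine.

```python
import re

def desugar_class_with_strings(strings, single_chars):
    """Build a pattern equivalent to a class holding strings plus single characters.

    Hypothetical helper for illustration; it is not part of Yarr.
    """
    # Longest strings first, so a string is tried before any of its prefixes.
    ordered = sorted(set(strings), key=lambda s: (-len(s), s))
    alternatives = [re.escape(s) for s in ordered]
    if single_chars:
        # The traditional single-character class always comes last.
        body = "".join(re.escape(c) for c in sorted(single_chars))
        alternatives.append("[" + body + "]")
    return "(?:" + "|".join(alternatives) + ")"

pattern = desugar_class_with_strings(["ab", "abc", "a"], {"x", "y"})
longest = re.match(pattern, "abcd").group(0)
```

With the strings in the opposite order ("a" before "abc"), the same input would
match only "a", which is why the longest-first ordering matters.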
There is a new set of combining operators allowed in class set character
classes. Two character class elements that appear adjacent to each other are
implicitly combined with the Union operation. There is also an Intersection
operation with the && operator and a Subtraction operation with the -- operator.
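As a rough model of the three combining operations, each class element can be
treated as a set of strings, with a single character being a string of length
one. The sketch below is an assumption-laden illustration in Python; the names
SET_OPS and combine_class_elements are hypothetical and do not correspond to
Yarr's CharacterClassConstructor.

```python
from functools import reduce

# Each class element is modelled as a set of strings; a single character is
# simply a string of length one. This is an illustrative model, not Yarr's
# internal representation.
SET_OPS = {
    "union": frozenset.union,
    "intersection": frozenset.intersection,
    "subtraction": frozenset.difference,
}

def combine_class_elements(op, operands):
    """Fold the operands left to right with a single combining operation."""
    return reduce(SET_OPS[op], (frozenset(o) for o in operands))

digits = set("0123456789")
octal_digits = combine_class_elements("subtraction", [digits, set("89")])
common = combine_class_elements("intersection", [set("abc"), set("bcd")])
mixed = combine_class_elements("union", [set("ab"), {"cd"}])
```

Folding left to right means a chain of subtractions removes each later operand
from the running result, for example (A minus B) minus C.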
There is new ClassSet parsing that follows new "cleaner" rules than traditional
character classes.
The prior character class constructor and delegates are mostly unchanged,
except for the compile mode now being switched on an enum instead of a bool.
Added check that both 'u' and 'v' flags don't appear in the same RegExp.
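A minimal Python sketch of how a flags string might map to a compile mode,
including the 'u'/'v' exclusion. The enum and function names here are
hypothetical, loosely modelled on the isLegacyCompilation,
isUnicodeCompilation, isUnicodeSetsCompilation and isEitherUnicodeCompilation
helpers listed below; the real code is C++ and may be structured differently.

```python
from enum import Enum

class CompileMode(Enum):
    # Hypothetical names; three modes replace the earlier unicode/non-unicode bool.
    LEGACY = "legacy"
    UNICODE = "unicode"
    UNICODE_SETS = "unicodeSets"

def compile_mode_for_flags(flags: str) -> CompileMode:
    """Pick a compile mode from a flags string, rejecting 'u' together with 'v'."""
    if "u" in flags and "v" in flags:
        raise ValueError("invalid flags: 'u' and 'v' cannot be combined")
    if "v" in flags:
        return CompileMode.UNICODE_SETS
    if "u" in flags:
        return CompileMode.UNICODE
    return CompileMode.LEGACY

def is_either_unicode(mode: CompileMode) -> bool:
    """True for both Unicode-aware modes."""
    return mode is not CompileMode.LEGACY
```

For example, compile_mode_for_flags("gv") selects the unicodeSets mode, while
compile_mode_for_flags("uv") raises an error.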
Added unicodeSets getter watchpoint to the
m_regExpPrimordialPropertiesWatchpointSet.
* JSTests/es6/Proxy_internal_get_calls_RegExp.prototype.flags.js:
* JSTests/stress/regexp-vflag-property-of-strings.js: Added.
(arrayToString):
(objectToString):
(dumpValue):
(compareArray):
(compareGroups):
(testRegExp):
(testRegExpSyntaxError):
* JSTests/stress/static-getter-in-names.js:
* JSTests/test262/config.yaml:
* LayoutTests/js/Object-getOwnPropertyNames-expected.txt:
* LayoutTests/js/script-tests/Object-getOwnPropertyNames.js:
* Source/JavaScriptCore/JavaScriptCore.xcodeproj/project.pbxproj:
* Source/JavaScriptCore/builtins/BuiltinNames.h:
* Source/JavaScriptCore/builtins/RegExpPrototype.js:
(linkTimeConstant.hasObservableSideEffectsForRegExpMatch):
(linkTimeConstant.hasObservableSideEffectsForRegExpSplit):
(overriddenName.string_appeared_here.split):
* Source/JavaScriptCore/builtins/StringPrototype.js:
(linkTimeConstant.hasObservableSideEffectsForStringReplace):
* Source/JavaScriptCore/bytecode/LinkTimeConstant.h:
* Source/JavaScriptCore/dfg/DFGAbstractInterpreterInlines.h:
(JSC::DFG::AbstractInterpreter<AbstractStateType>::executeEffects):
* Source/JavaScriptCore/dfg/DFGFixupPhase.cpp:
(JSC::DFG::FixupPhase::addStringReplacePrimordialChecks):
* Source/JavaScriptCore/dfg/DFGOperations.cpp:
(JSC::DFG::JSC_DEFINE_JIT_OPERATION):
* Source/JavaScriptCore/dfg/DFGStrengthReductionPhase.cpp:
(JSC::DFG::StrengthReductionPhase::handleNode):
* Source/JavaScriptCore/runtime/CachedTypes.cpp:
* Source/JavaScriptCore/runtime/CommonIdentifiers.h:
* Source/JavaScriptCore/runtime/JSGlobalObject.cpp:
(JSC::JSGlobalObject::init):
* Source/JavaScriptCore/runtime/JSGlobalObject.h:
* Source/JavaScriptCore/runtime/JSGlobalObjectInlines.h:
(JSC::JSGlobalObject::regExpProtoUnicodeSetsGetter const):
* Source/JavaScriptCore/runtime/RegExp.h:
* Source/JavaScriptCore/runtime/RegExpCache.h:
* Source/JavaScriptCore/runtime/RegExpObject.cpp:
(JSC::RegExpObject::matchGlobal):
* Source/JavaScriptCore/runtime/RegExpPrototype.cpp:
(JSC::RegExpPrototype::finishCreation):
(JSC::JSC_DEFINE_HOST_FUNCTION):
* Source/JavaScriptCore/ucd/emoji-sequences.txt: Added.
* Source/JavaScriptCore/ucd/emoji-zwj-sequences.txt: Added.
* Source/JavaScriptCore/yarr/Yarr.h:
* Source/JavaScriptCore/yarr/YarrErrorCode.cpp:
(JSC::Yarr::errorMessage):
(JSC::Yarr::errorToThrow):
* Source/JavaScriptCore/yarr/YarrErrorCode.h:
* Source/JavaScriptCore/yarr/YarrFlags.h:
* Source/JavaScriptCore/yarr/YarrInterpreter.cpp:
(JSC::Yarr::ByteTermDumper::ByteTermDumper):
(JSC::Yarr::ByteTermDumper::unicode const):
(JSC::Yarr::ByteTermDumper::unicodeSets const):
(JSC::Yarr::ByteTermDumper::eitherUnicode const):
(JSC::Yarr::Interpreter::tryConsumeBackReference):
(JSC::Yarr::Interpreter::matchCharacterClass):
(JSC::Yarr::Interpreter::backtrackCharacterClass):
(JSC::Yarr::Interpreter::matchDisjunction):
(JSC::Yarr::Interpreter::Interpreter):
(JSC::Yarr::Interpreter::isLegacyCompilation const):
(JSC::Yarr::Interpreter::isUnicodeCompilation const):
(JSC::Yarr::Interpreter::isUnicodeSetsCompilation const):
(JSC::Yarr::Interpreter::isEitherUnicodeCompilation const):
(JSC::Yarr::ByteTermDumper::dumpTerm):
(JSC::Yarr::ByteTermDumper::unicode): Deleted.
* Source/JavaScriptCore/yarr/YarrInterpreter.h:
(JSC::Yarr::BytecodePattern::BytecodePattern):
(JSC::Yarr::BytecodePattern::compileMode const):
(JSC::Yarr::BytecodePattern::unicodeSets const):
(JSC::Yarr::BytecodePattern::eitherUnicode const):
* Source/JavaScriptCore/yarr/YarrJIT.cpp:
* Source/JavaScriptCore/yarr/YarrParser.h:
(JSC::Yarr::Parser::CharacterClassParserDelegate::CharacterClassParserDelegate):
(JSC::Yarr::Parser::ClassSetParserDelegate::ClassSetParserDelegate):
(JSC::Yarr::Parser::ClassSetParserDelegate::begin):
(JSC::Yarr::Parser::ClassSetParserDelegate::nestedClassBegin):
(JSC::Yarr::Parser::ClassSetParserDelegate::doneAfterCharacterClassEnd):
(JSC::Yarr::Parser::ClassSetParserDelegate::setUnionOp):
(JSC::Yarr::Parser::ClassSetParserDelegate::setSubtractOp):
(JSC::Yarr::Parser::ClassSetParserDelegate::setIntersectionOp):
(JSC::Yarr::Parser::ClassSetParserDelegate::flushCachedCharacterIfNeeded):
(JSC::Yarr::Parser::ClassSetParserDelegate::atomPatternCharacter):
(JSC::Yarr::Parser::ClassSetParserDelegate::atomBuiltInCharacterClass):
(JSC::Yarr::Parser::ClassSetParserDelegate::end):
(JSC::Yarr::Parser::ClassSetParserDelegate::error):
(JSC::Yarr::Parser::ClassSetParserDelegate::assertionWordBoundary):
(JSC::Yarr::Parser::ClassSetParserDelegate::atomBackReference):
(JSC::Yarr::Parser::ClassSetParserDelegate::atomNamedBackReference):
(JSC::Yarr::Parser::ClassSetParserDelegate::atomNamedForwardReference):
(JSC::Yarr::Parser::ClassStringDisjunctionParserDelegate::ClassStringDisjunctionParserDelegate):
(JSC::Yarr::Parser::ClassStringDisjunctionParserDelegate::atomPatternCharacter):
(JSC::Yarr::Parser::ClassStringDisjunctionParserDelegate::newAlternative):
(JSC::Yarr::Parser::ClassStringDisjunctionParserDelegate::end):
(JSC::Yarr::Parser::ClassStringDisjunctionParserDelegate::assertionWordBoundary):
(JSC::Yarr::Parser::ClassStringDisjunctionParserDelegate::atomBackReference):
(JSC::Yarr::Parser::ClassStringDisjunctionParserDelegate::atomNamedBackReference):
(JSC::Yarr::Parser::ClassStringDisjunctionParserDelegate::atomNamedForwardReference):
(JSC::Yarr::Parser::ClassStringDisjunctionParserDelegate::atomBuiltInCharacterClass):
(JSC::Yarr::Parser::Parser):
(JSC::Yarr::Parser::isIdentityEscapeAnError):
(JSC::Yarr::Parser::parseEscape):
(JSC::Yarr::Parser::consumePossibleSurrogatePair):
(JSC::Yarr::Parser::parseAtomEscape):
(JSC::Yarr::Parser::parseCharacterClassEscape):
(JSC::Yarr::Parser::parseClassSetEscape):
(JSC::Yarr::Parser::parseClassStringDisjunctionEscape):
(JSC::Yarr::Parser::parseCharacterClass):
(JSC::Yarr::Parser::parseClassSet):
(JSC::Yarr::Parser::parseClassStringDisjunction):
(JSC::Yarr::Parser::parseParenthesesEnd):
(JSC::Yarr::Parser::parseTokens):
(JSC::Yarr::Parser::handleIllegalReferences):
(JSC::Yarr::Parser::tryConsumeUnicodeEscape):
(JSC::Yarr::Parser::tryConsumeUnicodePropertyExpression):
(JSC::Yarr::Parser::isLegacyCompilation const):
(JSC::Yarr::Parser::isUnicodeCompilation const):
(JSC::Yarr::Parser::isUnicodeSetsCompilation const):
(JSC::Yarr::Parser::isEitherUnicodeCompilation const):
(JSC::Yarr::compileMode):
(JSC::Yarr::parse):
* Source/JavaScriptCore/yarr/YarrPattern.cpp:
(JSC::Yarr::CharacterClassConstructor::CharacterClassConstructor):
(JSC::Yarr::CharacterClassConstructor::reset):
(JSC::Yarr::CharacterClassConstructor::combiningSetOp):
(JSC::Yarr::CharacterClassConstructor::append):
(JSC::Yarr::CharacterClassConstructor::appendInverted):
(JSC::Yarr::CharacterClassConstructor::putRange):
(JSC::Yarr::CharacterClassConstructor::atomClassStringDisjunction):
(JSC::Yarr::CharacterClassConstructor::performSetOpWith):
(JSC::Yarr::CharacterClassConstructor::performSetOpWithStrings):
(JSC::Yarr::CharacterClassConstructor::performSetOpWithMatches):
(JSC::Yarr::CharacterClassConstructor::hasInverteStrings):
(JSC::Yarr::CharacterClassConstructor::compareUTF32Strings):
(JSC::Yarr::CharacterClassConstructor::sort):
(JSC::Yarr::CharacterClassConstructor::charClass):
(JSC::Yarr::CharacterClassConstructor::mergeRangesFrom):
(JSC::Yarr::CharacterClassConstructor::unionStrings):
(JSC::Yarr::CharacterClassConstructor::intersectionStrings):
(JSC::Yarr::CharacterClassConstructor::subtractionStrings):
(JSC::Yarr::CharacterClassConstructor::asciiOpSorted):
(JSC::Yarr::CharacterClassConstructor::unicodeOpSorted):
(JSC::Yarr::YarrPatternConstructor::YarrPatternConstructor):
(JSC::Yarr::YarrPatternConstructor::resetForReparsing):
(JSC::Yarr::YarrPatternConstructor::atomPatternCharacter):
(JSC::Yarr::YarrPatternConstructor::atomBuiltInCharacterClass):
(JSC::Yarr::YarrPatternConstructor::atomCharacterClassAtom):
(JSC::Yarr::YarrPatternConstructor::atomCharacterClassRange):
(JSC::Yarr::YarrPatternConstructor::atomCharacterClassBuiltIn):
(JSC::Yarr::YarrPatternConstructor::atomClassStringDisjunction):
(JSC::Yarr::YarrPatternConstructor::atomCharacterClassSetOp):
(JSC::Yarr::YarrPatternConstructor::atomCharacterClassPushNested):
(JSC::Yarr::YarrPatternConstructor::atomCharacterClassPopNested):
(JSC::Yarr::YarrPatternConstructor::atomCharacterClassEnd):
(JSC::Yarr::YarrPattern::compile):
(JSC::Yarr::PatternTerm::dump):
(JSC::Yarr::YarrPattern::dumpPatternString):
(JSC::Yarr::YarrPattern::dumpPattern):
* Source/JavaScriptCore/yarr/YarrPattern.h:
(JSC::Yarr::CharacterClass::CharacterClass):
(JSC::Yarr::CharacterClass::hasNonBMPCharacters const):
(JSC::Yarr::CharacterClass::hasOneCharacterSize const):
(JSC::Yarr::CharacterClass::hasOnlyNonBMPCharacters const):
(JSC::Yarr::CharacterClass::hasStrings const):
(JSC::Yarr::CharacterClass::hasSingleCharacters const):
(JSC::Yarr::ClassSet::ClassSet):
(JSC::Yarr::YarrPattern::unicodeSets const):
(JSC::Yarr::YarrPattern::eitherUnicode const):
(JSC::Yarr::YarrPattern::compileMode const):
(JSC::Yarr::CharacterClass::hasNonBMPCharacters): Deleted.
(JSC::Yarr::CharacterClass::hasOneCharacterSize): Deleted.
(JSC::Yarr::CharacterClass::hasOnlyNonBMPCharacters): Deleted.
* Source/JavaScriptCore/yarr/YarrSyntaxChecker.cpp:
(JSC::Yarr::SyntaxChecker::atomClassStringDisjunction):
(JSC::Yarr::SyntaxChecker::atomCharacterClassSetOp):
(JSC::Yarr::SyntaxChecker::atomCharacterClassPushNested):
(JSC::Yarr::SyntaxChecker::atomCharacterClassPopNested):
(JSC::Yarr::checkSyntax):
* Source/JavaScriptCore/yarr/YarrUnicodeProperties.cpp:
(JSC::Yarr::unicodeMatchProperty):
(JSC::Yarr::createUnicodeCharacterClassFor):
(JSC::Yarr::characterClassMayContainStrings):
* Source/JavaScriptCore/yarr/YarrUnicodeProperties.h:
* Source/JavaScriptCore/yarr/generateYarrUnicodePropertyTables.py:
(PropertyData.__init__):
(PropertyData.addMatchString):
(PropertyData.stringsCompare):
(PropertyData):
(PropertyData.sortStrings):
(PropertyData.dumpMatchData):
(PropertyData.convertStringToCppFormat):
(PropertyData.dump):
(PropertyData.dumpAll):
(PropertyData.dumpMayContainStringFunc):
(BinaryProperty.dump):
(SequenceProperty):
(SequenceProperty.__init__):
(SequenceProperty.parsePropertyFile):
(SequenceProperty.dump):
* Source/WebCore/contentextensions/URLFilterParser.cpp:
(WebCore::ContentExtensions::PatternParser::atomClassStringDisjunction):
(WebCore::ContentExtensions::PatternParser::atomCharacterClassSetOp):
(WebCore::ContentExtensions::PatternParser::atomCharacterClassPushNested):
(WebCore::ContentExtensions::PatternParser::atomCharacterClassPopNested):
(WebCore::ContentExtensions::URLFilterParser::addPattern):
Canonical link: https://commits.webkit.org/261188@main