Branch: refs/heads/main
  Home:   https://github.com/WebKit/WebKit
  Commit: 270c824459cec4d19dab347a8db1526e0be50737
      
https://github.com/WebKit/WebKit/commit/270c824459cec4d19dab347a8db1526e0be50737
  Author: Michael Saboff <[email protected]>
  Date:   2023-03-03 (Fri, 03 Mar 2023)

  Changed paths:
    M JSTests/es6/Proxy_internal_get_calls_RegExp.prototype.flags.js
    A JSTests/stress/regexp-vflag-property-of-strings.js
    M JSTests/stress/static-getter-in-names.js
    M JSTests/test262/config.yaml
    M LayoutTests/js/Object-getOwnPropertyNames-expected.txt
    M LayoutTests/js/script-tests/Object-getOwnPropertyNames.js
    M Source/JavaScriptCore/JavaScriptCore.xcodeproj/project.pbxproj
    M Source/JavaScriptCore/builtins/BuiltinNames.h
    M Source/JavaScriptCore/builtins/RegExpPrototype.js
    M Source/JavaScriptCore/builtins/StringPrototype.js
    M Source/JavaScriptCore/bytecode/LinkTimeConstant.h
    M Source/JavaScriptCore/dfg/DFGAbstractInterpreterInlines.h
    M Source/JavaScriptCore/dfg/DFGFixupPhase.cpp
    M Source/JavaScriptCore/dfg/DFGOperations.cpp
    M Source/JavaScriptCore/dfg/DFGStrengthReductionPhase.cpp
    M Source/JavaScriptCore/runtime/CachedTypes.cpp
    M Source/JavaScriptCore/runtime/CommonIdentifiers.h
    M Source/JavaScriptCore/runtime/JSGlobalObject.cpp
    M Source/JavaScriptCore/runtime/JSGlobalObject.h
    M Source/JavaScriptCore/runtime/JSGlobalObjectInlines.h
    M Source/JavaScriptCore/runtime/RegExp.h
    M Source/JavaScriptCore/runtime/RegExpCache.h
    M Source/JavaScriptCore/runtime/RegExpObject.cpp
    M Source/JavaScriptCore/runtime/RegExpPrototype.cpp
    A Source/JavaScriptCore/ucd/emoji-sequences.txt
    A Source/JavaScriptCore/ucd/emoji-zwj-sequences.txt
    M Source/JavaScriptCore/yarr/Yarr.h
    M Source/JavaScriptCore/yarr/YarrErrorCode.cpp
    M Source/JavaScriptCore/yarr/YarrErrorCode.h
    M Source/JavaScriptCore/yarr/YarrFlags.cpp
    M Source/JavaScriptCore/yarr/YarrFlags.h
    M Source/JavaScriptCore/yarr/YarrInterpreter.cpp
    M Source/JavaScriptCore/yarr/YarrInterpreter.h
    M Source/JavaScriptCore/yarr/YarrJIT.cpp
    M Source/JavaScriptCore/yarr/YarrParser.h
    M Source/JavaScriptCore/yarr/YarrPattern.cpp
    M Source/JavaScriptCore/yarr/YarrPattern.h
    M Source/JavaScriptCore/yarr/YarrSyntaxChecker.cpp
    M Source/JavaScriptCore/yarr/YarrUnicodeProperties.cpp
    M Source/JavaScriptCore/yarr/YarrUnicodeProperties.h
    M Source/JavaScriptCore/yarr/generateYarrUnicodePropertyTables.py
    M Source/WebCore/contentextensions/URLFilterParser.cpp

  Log Message:
  -----------
  Implement RegExp `v` flag with set notation + properties of strings
https://bugs.webkit.org/show_bug.cgi?id=241593
rdar://100337109

Reviewed by Yusuke Suzuki.

This change implements the TC39 stage 3 proposal RegExp v flag with set notation + properties of strings,
https://github.com/tc39/proposal-regexp-v-flag.  It adds a new "unicodeSets" compile mode for the Yarr engine.
Like the prior Unicode Yarr features, this change is driven by Unicode Database Files (UCD).
This change includes two such new files, JavaScriptCore/ucd/{emoji-sequences.txt & emoji-zwj-sequences.txt}.

The newly added properties include lists of strings.  These strings are parsed through the character class syntax.
When it comes to matching, however, some desugaring turns such a property of strings into a list of alternations.
For example, if a property has strings str1...strN plus a traditional character class, single-character-class,
we create the pattern equivalent of:
     (?:str1|str2|...|strN|[single-character-class])
Per the spec, longer strings appear earlier in the alternation, and before the traditional character class.
This lets a longer string in a property list match even when that list also contains substrings of it.
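
The ordering rule above can be illustrated with a minimal JavaScript sketch.  It is not the Yarr implementation:
desugarPropertyOfStrings and escapeForRegExp are hypothetical helpers, and comparing lengths in code points is an
assumption made for this example.

```javascript
// Hypothetical helper: escape regular expression syntax characters in a literal string.
function escapeForRegExp(s) {
    return s.replace(/[.*+?^${}()|[\]\\\/]/g, "\\$&");
}

// Hypothetical helper: build the source of a non-capturing group equivalent to
// a property of strings plus its single-character class.
function desugarPropertyOfStrings(strings, singleCharacterClass) {
    // Longer strings first (length counted in code points in this sketch),
    // so a string is tried before any of its prefixes.
    const sorted = [...strings].sort((a, b) => [...b].length - [...a].length);
    const alternatives = sorted.map(escapeForRegExp);
    // The traditional character class always comes last.
    if (singleCharacterClass)
        alternatives.push(singleCharacterClass);
    return "(?:" + alternatives.join("|") + ")";
}

const source = desugarPropertyOfStrings(["a", "abc", "ab"], "[x-z]");
const sortedPattern = new RegExp("^" + source, "u");

// For contrast: the same strings in shortest-first order stop at the first prefix.
const naivePattern = /^(?:a|ab|abc|[x-z])/u;

const sortedMatch = "abcd".match(sortedPattern)[0];
const naiveMatch = "abcd".match(naivePattern)[0];
```

With the longest-first order the whole string "abc" is matched, whereas the shortest-first alternation matches only "a".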

There is a new set of combining operators allowed in class set character classes.
Two character class elements that appear adjacent to each other are implicitly combined with the Union operation.
There is also an Intersection operation with the && operator and a Subtraction operation with the -- operator.
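
As a rough model of the three combining operations, the sketch below represents a class set as a JavaScript Set of
strings.  This is an illustration only and assumes nothing about how CharacterClassConstructor stores its data;
the helper names are hypothetical.  In the proposal's syntax, subtraction is written with the -- operator.

```javascript
// Model a class set as a Set of strings; a single character is a one-code-point string.
const unionOf = (a, b) => new Set([...a, ...b]);
const intersectionOf = (a, b) => new Set([...a].filter((s) => b.has(s)));
const subtractionOf = (a, b) => new Set([...a].filter((s) => !b.has(s)));

const aToD = new Set(["a", "b", "c", "d"]);   // models [a-d]
const aAndE = new Set(["a", "e"]);            // models [ae]

// These model [[a-d][ae]], [[a-d]&&[ae]] and [[a-d]--[ae]] respectively.
const unionResult = [...unionOf(aToD, aAndE)].sort().join("");
const intersectionResult = [...intersectionOf(aToD, aAndE)].sort().join("");
const subtractionResult = [...subtractionOf(aToD, aAndE)].sort().join("");
```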

There is new ClassSet parsing that follows new "cleaner" rules than traditional character classes.
The prior character class constructor and delegates are mostly unchanged, except that the compile mode
is now switched on an enum instead of a bool.

Added check that both 'u' and 'v' flags don't appear in the same RegExp.
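
The flag check can be pictured with the following sketch.  validateFlags is a hypothetical function written for
this note, and its error messages are made up; it only shows the rule that 'u' and 'v' are mutually exclusive.

```javascript
// Hypothetical flag validator; not the JavaScriptCore implementation.
function validateFlags(flags) {
    const known = "dgimsuvy";
    const seen = new Set();
    for (const flag of flags) {
        // Reject unknown and duplicated flags.
        if (!known.includes(flag) || seen.has(flag))
            throw new SyntaxError("Invalid flags: " + flags);
        seen.add(flag);
    }
    // 'u' and 'v' select different compile modes, so both together is an error.
    if (seen.has("u") && seen.has("v"))
        throw new SyntaxError("Invalid flags: 'u' and 'v' cannot be combined");
    return true;
}

// Small helper for the usage below: report whether a call throws SyntaxError.
function throwsSyntaxError(fn) {
    try {
        fn();
    } catch (e) {
        return e instanceof SyntaxError;
    }
    return false;
}

const vAloneIsValid = validateFlags("gv");
const uvRejected = throwsSyntaxError(() => validateFlags("uv"));
```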

Added unicodeSets getter watchpoint to the m_regExpPrimordialPropertiesWatchpointSet.

* JSTests/es6/Proxy_internal_get_calls_RegExp.prototype.flags.js:
* JSTests/stress/regexp-vflag-property-of-strings.js: Added.
(arrayToString):
(objectToString):
(dumpValue):
(compareArray):
(compareGroups):
(testRegExp):
(testRegExpSyntaxError):
* JSTests/stress/static-getter-in-names.js:
* JSTests/test262/config.yaml:
* LayoutTests/js/Object-getOwnPropertyNames-expected.txt:
* LayoutTests/js/script-tests/Object-getOwnPropertyNames.js:
* Source/JavaScriptCore/JavaScriptCore.xcodeproj/project.pbxproj:
* Source/JavaScriptCore/builtins/BuiltinNames.h:
* Source/JavaScriptCore/builtins/RegExpPrototype.js:
(linkTimeConstant.hasObservableSideEffectsForRegExpMatch):
(linkTimeConstant.hasObservableSideEffectsForRegExpSplit):
(overriddenName.string_appeared_here.split):
* Source/JavaScriptCore/builtins/StringPrototype.js:
(linkTimeConstant.hasObservableSideEffectsForStringReplace):
* Source/JavaScriptCore/bytecode/LinkTimeConstant.h:
* Source/JavaScriptCore/dfg/DFGAbstractInterpreterInlines.h:
(JSC::DFG::AbstractInterpreter<AbstractStateType>::executeEffects):
* Source/JavaScriptCore/dfg/DFGFixupPhase.cpp:
(JSC::DFG::FixupPhase::addStringReplacePrimordialChecks):
* Source/JavaScriptCore/dfg/DFGOperations.cpp:
(JSC::DFG::JSC_DEFINE_JIT_OPERATION):
* Source/JavaScriptCore/dfg/DFGStrengthReductionPhase.cpp:
(JSC::DFG::StrengthReductionPhase::handleNode):
* Source/JavaScriptCore/runtime/CachedTypes.cpp:
* Source/JavaScriptCore/runtime/CommonIdentifiers.h:
* Source/JavaScriptCore/runtime/JSGlobalObject.cpp:
(JSC::JSGlobalObject::init):
* Source/JavaScriptCore/runtime/JSGlobalObject.h:
* Source/JavaScriptCore/runtime/JSGlobalObjectInlines.h:
(JSC::JSGlobalObject::regExpProtoUnicodeSetsGetter const):
* Source/JavaScriptCore/runtime/RegExp.h:
* Source/JavaScriptCore/runtime/RegExpCache.h:
* Source/JavaScriptCore/runtime/RegExpObject.cpp:
(JSC::RegExpObject::matchGlobal):
* Source/JavaScriptCore/runtime/RegExpPrototype.cpp:
(JSC::RegExpPrototype::finishCreation):
(JSC::JSC_DEFINE_HOST_FUNCTION):
* Source/JavaScriptCore/ucd/emoji-sequences.txt: Added.
* Source/JavaScriptCore/ucd/emoji-zwj-sequences.txt: Added.
* Source/JavaScriptCore/yarr/Yarr.h:
* Source/JavaScriptCore/yarr/YarrErrorCode.cpp:
(JSC::Yarr::errorMessage):
(JSC::Yarr::errorToThrow):
* Source/JavaScriptCore/yarr/YarrErrorCode.h:
* Source/JavaScriptCore/yarr/YarrFlags.h:
* Source/JavaScriptCore/yarr/YarrInterpreter.cpp:
(JSC::Yarr::ByteTermDumper::ByteTermDumper):
(JSC::Yarr::ByteTermDumper::unicode const):
(JSC::Yarr::ByteTermDumper::unicodeSets const):
(JSC::Yarr::ByteTermDumper::eitherUnicode const):
(JSC::Yarr::Interpreter::tryConsumeBackReference):
(JSC::Yarr::Interpreter::matchCharacterClass):
(JSC::Yarr::Interpreter::backtrackCharacterClass):
(JSC::Yarr::Interpreter::matchDisjunction):
(JSC::Yarr::Interpreter::Interpreter):
(JSC::Yarr::Interpreter::isLegacyCompilation const):
(JSC::Yarr::Interpreter::isUnicodeCompilation const):
(JSC::Yarr::Interpreter::isUnicodeSetsCompilation const):
(JSC::Yarr::Interpreter::isEitherUnicodeCompilation const):
(JSC::Yarr::ByteTermDumper::dumpTerm):
(JSC::Yarr::ByteTermDumper::unicode): Deleted.
* Source/JavaScriptCore/yarr/YarrInterpreter.h:
(JSC::Yarr::BytecodePattern::BytecodePattern):
(JSC::Yarr::BytecodePattern::compileMode const):
(JSC::Yarr::BytecodePattern::unicodeSets const):
(JSC::Yarr::BytecodePattern::eitherUnicode const):
* Source/JavaScriptCore/yarr/YarrJIT.cpp:
* Source/JavaScriptCore/yarr/YarrParser.h:
(JSC::Yarr::Parser::CharacterClassParserDelegate::CharacterClassParserDelegate):
(JSC::Yarr::Parser::ClassSetParserDelegate::ClassSetParserDelegate):
(JSC::Yarr::Parser::ClassSetParserDelegate::begin):
(JSC::Yarr::Parser::ClassSetParserDelegate::nestedClassBegin):
(JSC::Yarr::Parser::ClassSetParserDelegate::doneAfterCharacterClassEnd):
(JSC::Yarr::Parser::ClassSetParserDelegate::setUnionOp):
(JSC::Yarr::Parser::ClassSetParserDelegate::setSubtractOp):
(JSC::Yarr::Parser::ClassSetParserDelegate::setIntersectionOp):
(JSC::Yarr::Parser::ClassSetParserDelegate::flushCachedCharacterIfNeeded):
(JSC::Yarr::Parser::ClassSetParserDelegate::atomPatternCharacter):
(JSC::Yarr::Parser::ClassSetParserDelegate::atomBuiltInCharacterClass):
(JSC::Yarr::Parser::ClassSetParserDelegate::end):
(JSC::Yarr::Parser::ClassSetParserDelegate::error):
(JSC::Yarr::Parser::ClassSetParserDelegate::assertionWordBoundary):
(JSC::Yarr::Parser::ClassSetParserDelegate::atomBackReference):
(JSC::Yarr::Parser::ClassSetParserDelegate::atomNamedBackReference):
(JSC::Yarr::Parser::ClassSetParserDelegate::atomNamedForwardReference):
(JSC::Yarr::Parser::ClassStringDisjunctionParserDelegate::ClassStringDisjunctionParserDelegate):
(JSC::Yarr::Parser::ClassStringDisjunctionParserDelegate::atomPatternCharacter):
(JSC::Yarr::Parser::ClassStringDisjunctionParserDelegate::newAlternative):
(JSC::Yarr::Parser::ClassStringDisjunctionParserDelegate::end):
(JSC::Yarr::Parser::ClassStringDisjunctionParserDelegate::assertionWordBoundary):
(JSC::Yarr::Parser::ClassStringDisjunctionParserDelegate::atomBackReference):
(JSC::Yarr::Parser::ClassStringDisjunctionParserDelegate::atomNamedBackReference):
(JSC::Yarr::Parser::ClassStringDisjunctionParserDelegate::atomNamedForwardReference):
(JSC::Yarr::Parser::ClassStringDisjunctionParserDelegate::atomBuiltInCharacterClass):
(JSC::Yarr::Parser::Parser):
(JSC::Yarr::Parser::isIdentityEscapeAnError):
(JSC::Yarr::Parser::parseEscape):
(JSC::Yarr::Parser::consumePossibleSurrogatePair):
(JSC::Yarr::Parser::parseAtomEscape):
(JSC::Yarr::Parser::parseCharacterClassEscape):
(JSC::Yarr::Parser::parseClassSetEscape):
(JSC::Yarr::Parser::parseClassStringDisjunctionEscape):
(JSC::Yarr::Parser::parseCharacterClass):
(JSC::Yarr::Parser::parseClassSet):
(JSC::Yarr::Parser::parseClassStringDisjunction):
(JSC::Yarr::Parser::parseParenthesesEnd):
(JSC::Yarr::Parser::parseTokens):
(JSC::Yarr::Parser::handleIllegalReferences):
(JSC::Yarr::Parser::tryConsumeUnicodeEscape):
(JSC::Yarr::Parser::tryConsumeUnicodePropertyExpression):
(JSC::Yarr::Parser::isLegacyCompilation const):
(JSC::Yarr::Parser::isUnicodeCompilation const):
(JSC::Yarr::Parser::isUnicodeSetsCompilation const):
(JSC::Yarr::Parser::isEitherUnicodeCompilation const):
(JSC::Yarr::compileMode):
(JSC::Yarr::parse):
* Source/JavaScriptCore/yarr/YarrPattern.cpp:
(JSC::Yarr::CharacterClassConstructor::CharacterClassConstructor):
(JSC::Yarr::CharacterClassConstructor::reset):
(JSC::Yarr::CharacterClassConstructor::combiningSetOp):
(JSC::Yarr::CharacterClassConstructor::append):
(JSC::Yarr::CharacterClassConstructor::appendInverted):
(JSC::Yarr::CharacterClassConstructor::putRange):
(JSC::Yarr::CharacterClassConstructor::atomClassStringDisjunction):
(JSC::Yarr::CharacterClassConstructor::performSetOpWith):
(JSC::Yarr::CharacterClassConstructor::performSetOpWithStrings):
(JSC::Yarr::CharacterClassConstructor::performSetOpWithMatches):
(JSC::Yarr::CharacterClassConstructor::hasInverteStrings):
(JSC::Yarr::CharacterClassConstructor::compareUTF32Strings):
(JSC::Yarr::CharacterClassConstructor::sort):
(JSC::Yarr::CharacterClassConstructor::charClass):
(JSC::Yarr::CharacterClassConstructor::mergeRangesFrom):
(JSC::Yarr::CharacterClassConstructor::unionStrings):
(JSC::Yarr::CharacterClassConstructor::intersectionStrings):
(JSC::Yarr::CharacterClassConstructor::subtractionStrings):
(JSC::Yarr::CharacterClassConstructor::asciiOpSorted):
(JSC::Yarr::CharacterClassConstructor::unicodeOpSorted):
(JSC::Yarr::YarrPatternConstructor::YarrPatternConstructor):
(JSC::Yarr::YarrPatternConstructor::resetForReparsing):
(JSC::Yarr::YarrPatternConstructor::atomPatternCharacter):
(JSC::Yarr::YarrPatternConstructor::atomBuiltInCharacterClass):
(JSC::Yarr::YarrPatternConstructor::atomCharacterClassAtom):
(JSC::Yarr::YarrPatternConstructor::atomCharacterClassRange):
(JSC::Yarr::YarrPatternConstructor::atomCharacterClassBuiltIn):
(JSC::Yarr::YarrPatternConstructor::atomClassStringDisjunction):
(JSC::Yarr::YarrPatternConstructor::atomCharacterClassSetOp):
(JSC::Yarr::YarrPatternConstructor::atomCharacterClassPushNested):
(JSC::Yarr::YarrPatternConstructor::atomCharacterClassPopNested):
(JSC::Yarr::YarrPatternConstructor::atomCharacterClassEnd):
(JSC::Yarr::YarrPattern::compile):
(JSC::Yarr::PatternTerm::dump):
(JSC::Yarr::YarrPattern::dumpPatternString):
(JSC::Yarr::YarrPattern::dumpPattern):
* Source/JavaScriptCore/yarr/YarrPattern.h:
(JSC::Yarr::CharacterClass::CharacterClass):
(JSC::Yarr::CharacterClass::hasNonBMPCharacters const):
(JSC::Yarr::CharacterClass::hasOneCharacterSize const):
(JSC::Yarr::CharacterClass::hasOnlyNonBMPCharacters const):
(JSC::Yarr::CharacterClass::hasStrings const):
(JSC::Yarr::CharacterClass::hasSingleCharacters const):
(JSC::Yarr::ClassSet::ClassSet):
(JSC::Yarr::YarrPattern::unicodeSets const):
(JSC::Yarr::YarrPattern::eitherUnicode const):
(JSC::Yarr::YarrPattern::compileMode const):
(JSC::Yarr::CharacterClass::hasNonBMPCharacters): Deleted.
(JSC::Yarr::CharacterClass::hasOneCharacterSize): Deleted.
(JSC::Yarr::CharacterClass::hasOnlyNonBMPCharacters): Deleted.
* Source/JavaScriptCore/yarr/YarrSyntaxChecker.cpp:
(JSC::Yarr::SyntaxChecker::atomClassStringDisjunction):
(JSC::Yarr::SyntaxChecker::atomCharacterClassSetOp):
(JSC::Yarr::SyntaxChecker::atomCharacterClassPushNested):
(JSC::Yarr::SyntaxChecker::atomCharacterClassPopNested):
(JSC::Yarr::checkSyntax):
* Source/JavaScriptCore/yarr/YarrUnicodeProperties.cpp:
(JSC::Yarr::unicodeMatchProperty):
(JSC::Yarr::createUnicodeCharacterClassFor):
(JSC::Yarr::characterClassMayContainStrings):
* Source/JavaScriptCore/yarr/YarrUnicodeProperties.h:
* Source/JavaScriptCore/yarr/generateYarrUnicodePropertyTables.py:
(PropertyData.__init__):
(PropertyData.addMatchString):
(PropertyData.stringsCompare):
(PropertyData):
(PropertyData.sortStrings):
(PropertyData.dumpMatchData):
(PropertyData.convertStringToCppFormat):
(PropertyData.dump):
(PropertyData.dumpAll):
(PropertyData.dumpMayContainStringFunc):
(BinaryProperty.dump):
(SequenceProperty):
(SequenceProperty.__init__):
(SequenceProperty.parsePropertyFile):
(SequenceProperty.dump):
* Source/WebCore/contentextensions/URLFilterParser.cpp:
(WebCore::ContentExtensions::PatternParser::atomClassStringDisjunction):
(WebCore::ContentExtensions::PatternParser::atomCharacterClassSetOp):
(WebCore::ContentExtensions::PatternParser::atomCharacterClassPushNested):
(WebCore::ContentExtensions::PatternParser::atomCharacterClassPopNested):
(WebCore::ContentExtensions::URLFilterParser::addPattern):

Canonical link: https://commits.webkit.org/261188@main


_______________________________________________
webkit-changes mailing list
[email protected]
https://lists.webkit.org/mailman/listinfo/webkit-changes