Branch: refs/heads/main
  Home:   https://github.com/WebKit/WebKit
  Commit: 67969c218ddf357855d3c26ca4769b194fb1f4db
      
https://github.com/WebKit/WebKit/commit/67969c218ddf357855d3c26ca4769b194fb1f4db
  Author: Michael Saboff <msab...@apple.com>
  Date:   2024-03-13 (Wed, 13 Mar 2024)

  Changed paths:
    A JSTests/stress/regexp-unicode-dangling-surrogates.js
    M JSTests/test262/expectations.yaml
    M Source/JavaScriptCore/assembler/MacroAssemblerARM64.h
    M Source/JavaScriptCore/yarr/YarrInterpreter.cpp
    M Source/JavaScriptCore/yarr/YarrJIT.cpp
    M Source/JavaScriptCore/yarr/YarrJITRegisters.h

  Log Message:
  -----------
  [JSC] RegExp /u flag doesn't respect atomicity of surrogate pairs
https://bugs.webkit.org/show_bug.cgi?id=267011
rdar://124217243

Reviewed by Yusuke Suzuki.

Fixed a bug where a dangling surrogate in a pattern matched half of a valid
surrogate pair in a subject string.
Updated the reading of surrogates so that when we start reading in the middle
of a valid surrogate pair, we return an error code point which we never match.
Updated backtracking for non-greedy character class matching to use the start
index as the index to reset to when we fail to match, instead of doing math
with the current match count.
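
The reading rule above can be sketched roughly as follows. This is an
illustrative JavaScript model, not the YARR code: readCodePoint, matchesClass
and ERROR_CODE_POINT are hypothetical names, and the value -1 is taken from the
errorCodePoint mentioned later in this log.

```javascript
// Illustrative model only; these names are hypothetical, not YARR's.
const ERROR_CODE_POINT = -1;

const isLead = (u) => u >= 0xD800 && u <= 0xDBFF;
const isTrail = (u) => u >= 0xDC00 && u <= 0xDFFF;

// Read the code point that starts at code unit `index` of `subject`.
function readCodePoint(subject, index) {
    const unit = subject.charCodeAt(index);
    // Starting on the trail half of a valid pair: report the error code point.
    if (isTrail(unit) && index > 0 && isLead(subject.charCodeAt(index - 1)))
        return ERROR_CODE_POINT;
    // A lead followed by a trail decodes to one non-BMP code point.
    if (isLead(unit) && index + 1 < subject.length) {
        const next = subject.charCodeAt(index + 1);
        if (isTrail(next))
            return 0x10000 + ((unit - 0xD800) << 10) + (next - 0xDC00);
    }
    // A BMP character or a genuinely dangling surrogate.
    return unit;
}

// Match a code point against a list of [lo, hi] ranges, optionally inverted.
function matchesClass(codePoint, ranges, inverted = false) {
    // An inverted class has to reject the error code point explicitly,
    // otherwise "not in any range" would count as a match.
    if (inverted && codePoint === ERROR_CODE_POINT)
        return false;
    const inRange = ranges.some(([lo, hi]) => codePoint >= lo && codePoint <= hi);
    return inverted ? !inRange : inRange;
}
```

In this model a non-inverted class needs no special case, since -1 falls
outside every range.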

The fix above originally landed Jan 13, but it regressed some Unicode
performance tests and was subsequently rolled out.

This change builds on the prior fix by adding three optimizations to mitigate
the performance loss in the earlier fix.
 1. When matching a normal, non-inverted atom, we don't need to check for the
    errorCodePoint (-1) after reading a dangling surrogate.  The errorCodePoint
    won't match in that case.  For inverted atoms, we still need to check for
    the errorCodePoint and fail matching that atom.

 2. Changed the code emitted for a character class that has only one range.
    Before this change, we'd emit all range checks with each range check's
    failure target being the instruction right after its two conditional
    branches.  This works fine if there is another range check.  When all
    range checks have been performed, we add a branch to the failure
    (backtracking) code.

    If the character class has only one range and doesn't have any list of
    single characters, we can eliminate the branch to failure code by making
    the two conditional branches that make up a range check go directly to the
    failure code.

    This change appears to help JetStream2/babylon-wtp by at least 1.5%.

 3. (ARM64 only) When we read a non-BMP code point, consisting of two surrogate
    code units, and we fail to match any atom in the body of a RegExp, we were
    incrementing the subject string index by 1 and going back to the top of the
    loop to start matching the pattern again.  Now we dedicate a register to
    hold either 0 or 1 depending on the width of the first character read for
    that loop iteration.  When advancing the index for the next iteration, we
    add the value of that register to the updated index.  This eliminates one
    iteration through the matching loop for each non-BMP code point that
    doesn't match.

    This change appears to help JetStream2/UniPoker by 3+%.
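
The branch shapes described in item 2 can be modeled roughly as below. This is
a sketch that emits pseudo-instructions as strings, not the YarrJIT emitter:
emitRangeChecks and the label names are hypothetical, and it assumes the class
holds only ranges with no list of single characters.

```javascript
// Illustrative model only: emits pseudo-instructions as strings.
function emitRangeChecks(ranges, useSingleRangeForm) {
    const code = [];
    if (useSingleRangeForm && ranges.length === 1) {
        const [lo, hi] = ranges[0];
        // Both conditional branches target the failure code directly;
        // a match simply falls through.
        code.push(`branch if ch < ${lo} -> failure`);
        code.push(`branch if ch > ${hi} -> failure`);
        return code;
    }
    ranges.forEach(([lo, hi], i) => {
        // A failed range check continues right after its two branches.
        code.push(`branch if ch < ${lo} -> next${i}`);
        code.push(`branch if ch <= ${hi} -> matched`);
        code.push(`next${i}:`);
    });
    // Reached only when every range check has failed.
    code.push("jump -> failure");
    return code;
}

// Count branch instructions (labels excluded) for a one-range class.
const isBranch = (line) => line.startsWith("branch") || line.startsWith("jump");
const before = emitRangeChecks([[0x4E00, 0x9FFF]], false).filter(isBranch).length;
const after = emitRangeChecks([[0x4E00, 0x9FFF]], true).filter(isBranch).length;
console.log(before, after); // 3 2
```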

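The index advance described in item 3 can be modeled in the same hedged way.
countIterations is a hypothetical helper, and the pattern is assumed to fail at
every position so that only the loop count is observed.

```javascript
// Illustrative model only; the pattern is assumed to fail at every position.
function countIterations(subject, skipTrailHalf) {
    let iterations = 0;
    let index = 0;
    while (index < subject.length) {
        iterations++;
        // 1 when the first character read is a surrogate pair, else 0.
        const extra = subject.codePointAt(index) > 0xFFFF ? 1 : 0;
        // ... the pattern body would be tried here and fail ...
        index += 1 + (skipTrailHalf ? extra : 0);
    }
    return iterations;
}

const subject = "\u{1F600}\u{1F600}a"; // two non-BMP code points, then "a"
console.log(countIterations(subject, false)); // 5
console.log(countIterations(subject, true));  // 3
```
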
Added a new test and updated the Test262 expectations file.
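
For illustration, checks along the following lines exercise the behavior fixed
here. They are hypothetical and are not the contents of the added test file;
the expected values follow from /u matching by code point rather than by code
unit.

```javascript
// Hypothetical checks; not the contents of the added test file.
const pair = "\u{1F600}"; // U+1F600, encoded as \uD83D\uDE00

const results = {
    leadHalfUnicode: /\uD83D/u.test(pair),      // expected false
    trailHalfUnicode: /\uDE00/u.test(pair),     // expected false
    loneLeadUnicode: /\uD83D/u.test("\uD83D"),  // expected true
    wholePairUnicode: /\u{1F600}/u.test(pair),  // expected true
    leadHalfLegacy: /\uD83D/.test(pair),        // expected true: no /u, code units
};
console.log(results);
```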

* JSTests/stress/regexp-unicode-dangling-surrogates.js: Added.
(arrayToString):
(objectToString):
(dumpValue):
(compareArray):
(compareGroups):
(testRegExp):
(testRegExpSyntaxError):
* JSTests/test262/expectations.yaml:
* Source/JavaScriptCore/assembler/MacroAssemblerARM64.h:
(JSC::MacroAssemblerARM64::moveConditionallyTest32): Added to conditionally 
zero a register.
(JSC::MacroAssemblerARM64::addOneConditionally32): Added to conditionally 
increment a register.
* Source/JavaScriptCore/yarr/YarrInterpreter.cpp:
(JSC::Yarr::Interpreter::InputStream::readChecked):
* Source/JavaScriptCore/yarr/YarrJIT.cpp:
* Source/JavaScriptCore/yarr/YarrJITRegisters.h:

Canonical link: https://commits.webkit.org/276031@main


