On Fri, 28 Jul 2023 12:15:21 GMT, Cristian Vat <[email protected]> wrote:
>> Reduces excessive allocation of Matcher.groups array when the original
>> Pattern has no groups or less than 9 groups.
>>
>> Original clamping to 10 possibly due to documented behavior from javadoc:
>> "In this class, \1 through \9 are always interpreted as back references, "
>>
>> With only the Matcher changes, RegExTest.backRefTest fails when
>> backreferences to non-existing groups are present.
>> Added a match failure condition in Pattern that fixes the failing tests.
>>
>> As per existing `java.util.regex.Pattern.BackRef#match`: "// If the
>> referenced group didn't match, neither can this"
>>
>> A group that does not exist in the original Pattern can never match so
>> neither can a backref to that group.
>> If the group existed in the original Pattern then it would have had space
>> allocated in Matcher.groups for that group index.
>> So a group index outside groups array length must never match.
>
> Cristian Vat has updated the pull request incrementally with one additional
> commit since the last revision:
>
> changes and test for CIBackRef

I also made the change in `CIBackRef`, and copied the existing test into a new
`ciBackRefTest`, adapted for case-insensitive matching.
It's not pretty, since a later change to one test could miss the other, but all
the patterns are different. (For what it's worth, the tests pass.)
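
As a rough illustration of what both tests are checking (this is not the actual `RegExTest` code; the class name `BackRefDemo` and the helper `find` are made up for this sketch): a backreference to a group that does not exist in the pattern still compiles, but should never match, with or without `(?i)`.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the JDK test suite.
public class BackRefDemo {
    // Returns true if the regex finds a match anywhere in the input.
    public static boolean find(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        // Backreference to an existing group: expected to match.
        System.out.println(find("(a*)bc\\1", "zzzaabcazzz"));
        // Same with (?i), which goes through the case-insensitive backref path.
        System.out.println(find("(?i)(a*)bc\\1", "zzzaAbcazzz"));

        // \3 refers to a group the pattern does not have: it compiles,
        // but the match is expected to fail in both variants.
        System.out.println(find("(abc)(def)\\3", "abcdefabc"));
        System.out.println(find("(?i)(abc)(def)\\3", "abcdefabc"));

        // No groups at all, yet \1 is still accepted as a backreference.
        System.out.println(find("abcdef\\1", "abcdef"));
    }
}
```

The last three cases are the ones affected by the smaller `Matcher.groups` array, since the referenced group index has no slot allocated for it.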
There's also a special case with the supplementary character tests:
`toSupplementaries` generates an invalid pattern when its input contains
`(?i)`, so I had to keep the flag outside the call, like:
```
pattern = Pattern.compile("(?i)" + toSupplementaries("(a*)bc\\1"));
```
Definitely needs a close look by a regex expert.
-------------
PR Comment: https://git.openjdk.org/jdk/pull/14894#issuecomment-1655581689