On Thu, 8 Jan 2026 04:44:55 GMT, Alexey Semenyuk <[email protected]> wrote:
> Replace reluctant quantifier `*?` with the possessive alternative (`*+`) and
> get rid of back-references from the regexp tokenizing a value of the
> "--arguments" option into a string array to fix the catastrophic backtracking
> resulting in a stack overflow.
>
> Old regexp: `(?:(?:(["'])(?:\\\1|.)*?(?:\1|$))|(?:\["'\s]|[^\s]))++`
>
> New regexp
> `(?:(?:(?:'(?:\'|[^'])*+(?:'|$))|(?:"(?:\"|[^"])*+(?:"|$)))|(?:\["'\s]|\S))++`
>
> Add test cases that pass both the old and the new variants of the regexp,
> except for the last test case that causes a stack overflow with the old
> regexp.
>
> The initial intention was to replace the regexp with the tokenizer function.
> It was abandoned in favor of reworking the regexp to minimize the risk of
> regressions.

@sashamatveev PTAL

@fguallini can you think of any inputs to stress test/break the new regexp?

-------------

PR Comment: https://git.openjdk.org/jdk/pull/29104#issuecomment-3724876512
PR Comment: https://git.openjdk.org/jdk/pull/29104#issuecomment-3724892108
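
For anyone who wants to experiment with inputs, here is a minimal sketch of the possessive-quantifier approach described in the quoted PR description. It is not the code from the PR: the class name `ArgTokenizerSketch`, the `tokenize` helper, and the exact pattern are assumptions made for illustration. In particular, the pattern assumes that the escape alternatives are meant to match a backslash followed by a quote or whitespace character. It uses one branch per quote character instead of a back-reference, and `*+` / `++` so that the matcher never backtracks into a quoted run.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ArgTokenizerSketch {
    // Illustrative pattern (an assumption, not copied from the PR).
    // A token is a run of: single-quoted text, double-quoted text,
    // a backslash-escaped quote or whitespace, or any non-whitespace char.
    // An unterminated quote is allowed to run to the end of the input.
    private static final Pattern TOKEN = Pattern.compile(
            "(?:'(?:\\\\'|[^'])*+(?:'|$)"        // '...' with \' allowed inside
            + "|\"(?:\\\\\"|[^\"])*+(?:\"|$)"    // "..." with \" allowed inside
            + "|\\\\[\"'\\s]"                    // escaped quote or whitespace
            + "|\\S)++");                        // any other non-whitespace char

    // Splits the value into tokens; quotes and escapes are kept as-is.
    static List<String> tokenize(String value) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(value);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [foo, 'bar baz', "a b", c\ d]
        System.out.println(tokenize("foo 'bar baz' \"a b\" c\\ d"));

        // A long quoted value, the kind of input worth stress testing.
        String big = "'" + "x".repeat(100_000) + "'";
        System.out.println(tokenize(big).size());
    }
}
```

This sketch only splits the value into tokens; any unquoting or unescaping of the tokens would be a separate step and is not shown here. `String.repeat` requires Java 11 or later.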
