On Thu, 8 Jan 2026 04:44:55 GMT, Alexey Semenyuk <[email protected]> wrote:

> Replace the reluctant quantifier `*?` with its possessive counterpart (`*+`)
> and remove the back-references from the regexp that tokenizes the value of
> the "--arguments" option into a string array. This fixes the catastrophic
> backtracking that resulted in a stack overflow.
> 
> Old regexp: `(?:(?:(["'])(?:\\\1|.)*?(?:\1|$))|(?:\["'\s]|[^\s]))++`
> 
> New regexp:
> `(?:(?:(?:'(?:\'|[^'])*+(?:'|$))|(?:"(?:\"|[^"])*+(?:"|$)))|(?:\["'\s]|\S))++`
> 
> Add test cases that pass with both the old and the new variants of the
> regexp, except for the last test case, which causes a stack overflow with
> the old regexp.
> 
> The initial intention was to replace the regexp with a tokenizer function.
> That approach was abandoned in favor of reworking the regexp to minimize the
> risk of regressions.

@sashamatveev PTAL

@fguallini can you think of any inputs to stress test/break the new regexp?
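
A small standalone harness may help when probing for such inputs. The sketch below is not the jpackage code: the class `ArgTokenizerSketch`, the method `tokenize`, and the pattern itself are assumptions written for illustration. The pattern only follows the shape described above (one alternative per quote character instead of a back-reference, possessive quantifiers throughout) and is not copied from the PR.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical harness, not the jpackage implementation.
class ArgTokenizerSketch {

    // Illustrative pattern (an assumption, not the PR's regexp).
    // Possessive quantifiers (*+ and ++) never give back what they
    // have consumed, so a long quoted run is scanned in a single pass.
    private static final Pattern TOKEN = Pattern.compile(
            "(?:'(?:\\\\'|[^'])*+(?:'|$)"      // '...' or an unterminated '
            + "|\"(?:\\\\\"|[^\"])*+(?:\"|$)"  // "..." or an unterminated "
            + "|\\\\[\"'\\s]"                  // backslash-escaped quote or whitespace
            + "|\\S)++");                      // any other non-whitespace character

    // Returns the raw tokens; quotes and escapes are left in place.
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String[] inputs = {
            "foo \"bar baz\" 'qux quux'",  // both quote styles
            "a\\ b c",                     // escaped whitespace
            "--opt=\"a b\"",               // quote adjacent to unquoted text
            "'it\\'s' x",                  // escaped quote inside quotes
        };
        for (String input : inputs) {
            System.out.println(tokenize(input));
        }
        // Long unterminated quote, the kind of input that stresses
        // reluctant or greedy group repetition.
        String huge = "\"" + "x".repeat(100_000);
        System.out.println(tokenize(huge).get(0).length());
    }
}
```

Other candidates worth feeding through `tokenize` are long runs of backslashes, alternating quote characters, and very long unquoted tokens.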

-------------

PR Comment: https://git.openjdk.org/jdk/pull/29104#issuecomment-3724876512
PR Comment: https://git.openjdk.org/jdk/pull/29104#issuecomment-3724892108
