xiongbo-sjtu commented on PR #45942: URL: https://github.com/apache/spark/pull/45942#issuecomment-2046482693
> `Pattern.compile` is operating on a new instance of `Pattern` object - and so inherits the thread safety of `Pattern`. It is not clear to me what the issue being observed is, and how this PR will help towards it. You're right -- I cannot see any obvious thread-safety issue in [the OpenJDK implementation](https://github.com/openjdk/jdk/blob/b81b86da9849fbc4fb341bff8a23d10aee9967b3/src/java.base/share/classes/java/util/regex/Pattern.java#L1101). "Failed to parse time string" in the stack traces indicates that JavaUtils.timeStringAs throws the following exception even if the input is a legitimate time string like 120s or 3s. It seems that some internal state is corrupted but I'm not sure about the root cause. ``` Matcher m = TIME_PATTERN.matcher(lower); if (!m.matches()) { throw new NumberFormatException("Failed to parse time string: " + str); } ``` To mitigate the issue, I've updated the PR to add a fallback logic so that a critical path like RpcUtils$.askRpcTimeout is less vulnerable to the regex logic. Note that [issues]((https://bugs.openjdk.org/issues/?jql=text%20~%20%22regex%20Pattern%22%20ORDER%20BY%20created%20DESC)) were reported from time to time for java.util.regex, for example, - [JDK-6328855](https://bugs.openjdk.org/browse/JDK-6328855) - [JDK-6609854](https://bugs.openjdk.org/browse/JDK-6609854) Please take a look of the revised PR and let me know if it looks good to you. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
