Re: [PR] [FLINK-39650][table] REGEXP_REPLACE plan-time validation and hot-path log cleanup [flink]

via GitHub Tue, 19 May 2026 05:23:23 -0700


dylanhz commented on code in PR #28189:
URL: https://github.com/apache/flink/pull/28189#discussion_r3266180383



##########
flink-table/flink-table-runtime/src/main/java/org/apache/flink/table/runtime/functions/SqlFunctionUtils.java:
##########
@@ -422,21 +422,20 @@ public static String splitIndex(String str, int character, int index) {
 
     /**
      * Returns a string resulting from replacing all substrings that match the regular expression
-     * with replacement.
+     * with replacement. Literal regexes are validated at planning time by the input type strategy.
      */
     public static String regexpReplace(String str, String regex, String replacement) {
         if (str == null || regex == null || replacement == null) {
             return null;
         }
         try {
-            return str.replaceAll(regex, Matcher.quoteReplacement(replacement));
-        } catch (Exception e) {
-            LOG.error(
-                    String.format(
-                            "Exception in regexpReplace('%s', '%s', '%s')",
-                            str, regex, replacement),
-                    e);
-            // return null if exception in regex replace
+            return REGEXP_PATTERN_CACHE

Review Comment:
   > not sure we need it
   > 
   > instead of this better to make it reuse on code gen level then no need to even enter/invoke same method second time
   
   I’d lean towards handling this separately, ideally with a benchmark.
   
   This PR already removes the more expensive per-record `Pattern.compile`. 
Skipping `REGEXP_PATTERN_CACHE.get(...)` for literal regexps seems like a 
narrower optimization, and may add some complexity to the function/codegen 
path. If we pursue it, it would be better to apply it consistently across the 
regexp family in a follow-up PR.
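
To make the trade-off concrete, here is a minimal sketch of the three variants under discussion. It is not Flink's code: the class `RegexpReplaceSketch`, its method names, and the `ConcurrentHashMap`-based `PATTERN_CACHE` are hypothetical stand-ins, and the real `REGEXP_PATTERN_CACHE` may be implemented differently (for example, with bounded size).

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not Flink's SqlFunctionUtils.
final class RegexpReplaceSketch {

    // Assumed stand-in for a pattern cache: compiled patterns keyed by regex text.
    // Unbounded here for brevity; a production cache would normally cap its size.
    private static final Map<String, Pattern> PATTERN_CACHE = new ConcurrentHashMap<>();

    private RegexpReplaceSketch() {}

    // Variant 1: String.replaceAll compiles the regex on every call (per record).
    static String replacePerRecordCompile(String str, String regex, String replacement) {
        if (str == null || regex == null || replacement == null) {
            return null;
        }
        return str.replaceAll(regex, Matcher.quoteReplacement(replacement));
    }

    // Variant 2: compile once per distinct regex; each record pays only a map lookup.
    static String replaceWithCache(String str, String regex, String replacement) {
        if (str == null || regex == null || replacement == null) {
            return null;
        }
        Pattern pattern = PATTERN_CACHE.computeIfAbsent(regex, Pattern::compile);
        return pattern.matcher(str).replaceAll(Matcher.quoteReplacement(replacement));
    }

    // Variant 3: the caller (e.g. generated code) holds a Pattern compiled once for a
    // literal regex, so even the per-record map lookup is skipped.
    static String replaceWithPrecompiled(String str, Pattern literal, String replacement) {
        if (str == null || literal == null || replacement == null) {
            return null;
        }
        return literal.matcher(str).replaceAll(Matcher.quoteReplacement(replacement));
    }
}
```

Moving from variant 1 to variant 2 removes a regex compilation per record, while moving from variant 2 to variant 3 removes only a hash lookup. Whether that second step is worth the extra codegen plumbing is the question a benchmark would need to answer.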



