Copilot commented on code in PR #18790:
URL: https://github.com/apache/pinot/pull/18790#discussion_r3439621811


##########
pinot-common/src/test/java/org/apache/pinot/common/function/scalar/StringFunctionsTest.java:
##########
@@ -568,4 +568,75 @@ public void testRegexpSubstr() {
     assertEquals(StringFunctions.regexpSubstr("", "a"), null);
     assertEquals(StringFunctions.regexpSubstr("Hello World", "[A-Z][a-z]+"), 
"Hello");
   }
+
+  // ==================== Tests for translate ====================
+
+  @Test
+  public void testTranslate() {
+    // Basic character replacement
+    assertEquals(StringFunctions.translate("hello", "aeiou", "AEIOU"), 
"hEllO");
+
+    // Replacement and deletion: 'c' has no target in 'to' → deleted
+    assertEquals(StringFunctions.translate("abc", "abc", "xy"), "xy");
+
+    // Mixed: replacement and deletion
+    assertEquals(StringFunctions.translate("abcdef", "ace", "XY"), "XbYdf");
+
+    // No characters in 'from' match → input unchanged
+    assertEquals(StringFunctions.translate("hello", "xyz", "123"), "hello");
+
+    // Empty 'from' → input unchanged
+    assertEquals(StringFunctions.translate("hello", "", "abc"), "hello");
+
+    // Empty input → empty result
+    assertEquals(StringFunctions.translate("", "abc", "xyz"), "");
+
+    // All characters deleted (to is empty)
+    assertEquals(StringFunctions.translate("abc", "abc", ""), "");
+
+    // SQL standard example: partial substitution — digits 0-3 map to z/e/r/o; 
digits 4-9 are deleted
+    assertEquals(StringFunctions.translate("12300", "0123456789", "zero"), 
"erozz");
+
+    // Duplicate characters in 'from': first occurrence wins
+    assertEquals(StringFunctions.translate("aaa", "aa", "XY"), "XXX");
+  }
+
+  // ==================== Tests for overlay ====================
+
+  @Test
+  public void testOverlay() {
+    // Basic replacement: length defaults to length of replacement
+    assertEquals(StringFunctions.overlay("hello world", "there", 7), "hello 
there");
+
+    // Explicit length equal to length of replacement — same result
+    assertEquals(StringFunctions.overlay("hello world", "there", 7, 5), "hello 
there");
+
+    // Zero-length deletion: pure insertion
+    assertEquals(StringFunctions.overlay("abcdef", "XY", 3, 0), "abXYcdef");
+
+    // Delete more than replacement length: replacement is shorter than 
deleted span
+    // FROM 3 FOR 4 removes positions 3-6 (cdef), nothing remains after 
position 6
+    assertEquals(StringFunctions.overlay("abcdef", "XY", 3, 4), "abXY");

Review Comment:
   The PR description’s example for `OVERLAY('abcdef' PLACING 'XY' FROM 3 FOR 
4)` shows an output of `abXYf`, but the implementation/tests correctly produce 
`abXY` (deleting `cdef` and inserting `XY`). Consider updating the PR 
description example to avoid confusion when reviewing/using the function.



##########
pinot-common/src/test/java/org/apache/pinot/common/function/scalar/StringFunctionsTest.java:
##########
@@ -568,4 +568,75 @@ public void testRegexpSubstr() {
     assertEquals(StringFunctions.regexpSubstr("", "a"), null);
     assertEquals(StringFunctions.regexpSubstr("Hello World", "[A-Z][a-z]+"), 
"Hello");
   }
+
+  // ==================== Tests for translate ====================
+
+  @Test
+  public void testTranslate() {
+    // Basic character replacement
+    assertEquals(StringFunctions.translate("hello", "aeiou", "AEIOU"), 
"hEllO");
+
+    // Replacement and deletion: 'c' has no target in 'to' → deleted
+    assertEquals(StringFunctions.translate("abc", "abc", "xy"), "xy");
+
+    // Mixed: replacement and deletion
+    assertEquals(StringFunctions.translate("abcdef", "ace", "XY"), "XbYdf");
+
+    // No characters in 'from' match → input unchanged
+    assertEquals(StringFunctions.translate("hello", "xyz", "123"), "hello");
+
+    // Empty 'from' → input unchanged
+    assertEquals(StringFunctions.translate("hello", "", "abc"), "hello");
+
+    // Empty input → empty result
+    assertEquals(StringFunctions.translate("", "abc", "xyz"), "");
+
+    // All characters deleted (to is empty)
+    assertEquals(StringFunctions.translate("abc", "abc", ""), "");
+
+    // SQL standard example: partial substitution — digits 0-3 map to z/e/r/o; 
digits 4-9 are deleted
+    assertEquals(StringFunctions.translate("12300", "0123456789", "zero"), 
"erozz");
+
+    // Duplicate characters in 'from': first occurrence wins
+    assertEquals(StringFunctions.translate("aaa", "aa", "XY"), "XXX");
+  }
+
+  // ==================== Tests for overlay ====================
+
+  @Test
+  public void testOverlay() {
+    // Basic replacement: length defaults to length of replacement
+    assertEquals(StringFunctions.overlay("hello world", "there", 7), "hello 
there");
+
+    // Explicit length equal to length of replacement — same result
+    assertEquals(StringFunctions.overlay("hello world", "there", 7, 5), "hello 
there");
+
+    // Zero-length deletion: pure insertion
+    assertEquals(StringFunctions.overlay("abcdef", "XY", 3, 0), "abXYcdef");
+
+    // Delete more than replacement length: replacement is shorter than 
deleted span
+    // FROM 3 FOR 4 removes positions 3-6 (cdef), nothing remains after 
position 6
+    assertEquals(StringFunctions.overlay("abcdef", "XY", 3, 4), "abXY");
+
+    // Replace at start (position 1)
+    assertEquals(StringFunctions.overlay("abcdef", "Z", 1, 1), "Zbcdef");
+
+    // Replace at end
+    assertEquals(StringFunctions.overlay("abcdef", "Z", 6, 1), "abcdeZ");
+
+    // Replace entire string
+    assertEquals(StringFunctions.overlay("abcdef", "XY", 1, 6), "XY");
+
+    // Empty replacement (deletion only)
+    assertEquals(StringFunctions.overlay("abcdef", "", 3, 2), "abef");
+
+    // Empty input: result is the replacement
+    assertEquals(StringFunctions.overlay("", "abc", 1), "abc");
+
+    // start beyond end: appends replacement
+    assertEquals(StringFunctions.overlay("abc", "XY", 10), "abcXY");
+
+    // length clamped: cannot delete past end of string
+    assertEquals(StringFunctions.overlay("abc", "Z", 2, 100), "aZ");

Review Comment:
   `overlay()` clamps `start` to the valid range and clamps negative `length` 
to 0, but the unit tests don't currently cover these edge cases. Adding 
assertions for `start <= 0` and `length < 0` would prevent regressions in the 
clamping behavior.



##########
pinot-common/src/main/java/org/apache/pinot/common/function/scalar/StringFunctions.java:
##########
@@ -1120,4 +1120,87 @@ public static String regexpSubstr(String input, String 
regexp) {
     Matcher matcher = Pattern.compile(regexp).matcher(input);
     return matcher.find() ? matcher.group() : null;
   }
+
+  /**
+   * Returns a string with every occurrence of a character in {@code from} 
replaced by the corresponding character in
+   * {@code to}. Characters in {@code input} that do not appear in {@code 
from} are left unchanged. If {@code from} is
+   * longer than {@code to}, characters beyond the length of {@code to} are 
deleted from the result. Equivalent to the
+   * SQL standard {@code TRANSLATE(string, from, to)} function supported by 
PostgreSQL, Oracle, and Trino.
+   *
+   * <p>Examples:
+   * <pre>
+   *   translate("hello", "aeiou", "AEIOU") → "hEllO"
+   *   translate("abc", "abc", "xy")        → "xy"   (c has no target → 
deleted)
+   *   translate("abcdef", "ace", "XY")     → "XbYdf" (e has no target → 
deleted)
+   * </pre>
+   *
+   * @param input the source string
+   * @param from  characters to search for (must not be null)
+   * @param to    replacement characters (must not be null; may be shorter 
than {@code from})
+   * @return the translated string
+   */
+  @ScalarFunction
+  public static String translate(String input, String from, String to) {
+    if (input.isEmpty() || from.isEmpty()) {

Review Comment:
   This PR adds a new `translate()` scalar function and corresponding tests, 
but the PR title/description focus on `OVERLAY` (and mention TRANSLATE as if it 
already exists). Please confirm `translate()` is intended to ship in the same 
PR; otherwise, consider splitting it out or updating the PR 
title/description/test plan to reflect the additional public function.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to