uros-db commented on code in PR #46762:
URL: https://github.com/apache/spark/pull/46762#discussion_r1667892969


##########
common/unsafe/src/main/java/org/apache/spark/sql/catalyst/util/CollationAwareUTF8String.java:
##########
@@ -657,57 +659,64 @@ public static Map<String, String> getCollationAwareDict(UTF8String string,
   public static UTF8String lowercaseTrim(
       final UTF8String srcString,
       final UTF8String trimString) {
+    return lowercaseTrimRight(lowercaseTrimLeft(srcString, trimString), trimString);
+  }
+
+  public static UTF8String trim(
+      final UTF8String srcString,
+      final UTF8String trimString,
+      final int collationId) {
+    return trimRight(trimLeft(srcString, trimString, collationId), trimString, collationId);
+  }
+
+  public static UTF8String lowercaseTrimLeft(
+      final UTF8String srcString,
+      final UTF8String trimString) {
     // Matching UTF8String behavior for null `trimString`.
     if (trimString == null) {
       return null;
     }
 
-    UTF8String leftTrimmed = lowercaseTrimLeft(srcString, trimString);
-    return lowercaseTrimRight(leftTrimmed, trimString);
+    HashSet<Integer> trimChars = new HashSet<>();
+    Iterator<Integer> trimIter = trimString.codePointIterator();
+    while (trimIter.hasNext()) trimChars.add(UCharacter.toLowerCase(trimIter.next()));
+
+    int searchIndex = 0;
+    Iterator<Integer> srcIter = srcString.codePointIterator();
+    while (srcIter.hasNext()) {
+      if (!trimChars.contains(UCharacter.toLowerCase(srcIter.next()))) break;
+      ++searchIndex;
+    }
+
+    return srcString.substring(searchIndex, srcString.numChars());
   }
 
-  public static UTF8String lowercaseTrimLeft(
+  public static UTF8String trimLeft(

Review Comment:
   Yeah, I think all of these methods should be public.
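
For readers following the hunk above, here is a minimal sketch of the left-trim logic it adds. It is an illustration only, not the code in the PR: it assumes plain `java.lang.String` and `Character.toLowerCase` in place of `UTF8String` and ICU's `UCharacter`, and the class name `LowercaseTrimSketch` is hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for the lowercaseTrimLeft logic in the diff.
// Assumption: java.lang.String and Character.toLowerCase replace
// UTF8String and ICU's UCharacter, so edge-case results may differ.
final class LowercaseTrimSketch {

  static String lowercaseTrimLeft(String src, String trim) {
    // Mirror the null handling for the trim string shown in the diff.
    if (trim == null) {
      return null;
    }

    // Collect the lowercased code points of the trim string.
    Set<Integer> trimChars = new HashSet<>();
    trim.codePoints().forEach(cp -> trimChars.add(Character.toLowerCase(cp)));

    // Skip leading code points whose lowercase form is in the set.
    int offset = 0;
    while (offset < src.length()) {
      int cp = src.codePointAt(offset);
      if (!trimChars.contains(Character.toLowerCase(cp))) {
        break;
      }
      // Advance by one or two chars, depending on the code point.
      offset += Character.charCount(cp);
    }
    return src.substring(offset);
  }

  public static void main(String[] args) {
    // Leading "X" and "x" both match the trim character "x".
    System.out.println(lowercaseTrimLeft("XxAbcx", "x"));
  }
}
```

Only the leading run is removed, so a trailing match is left for the right-hand trim, which is why the diff composes `lowercaseTrimRight(lowercaseTrimLeft(...))`.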



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

