stefankandic commented on code in PR #45820:
URL: https://github.com/apache/spark/pull/45820#discussion_r1548113382


##########
common/unsafe/src/main/java/org/apache/spark/unsafe/types/UTF8String.java:
##########
@@ -1155,6 +1156,71 @@ public UTF8String translate(Map<String, String> dict) {
     return fromString(sb.toString());
   }
 
+  public UTF8String translate(Map<String, String> dict, int collationId) {
+    if(CollationFactory.fetchCollation(collationId).supportsBinaryEquality) {
+      return translate(dict);
+    }
+    return collationAwareTranslate(dict, collationId);
+  }
+
+  public UTF8String collationAwareTranslate(Map<String, String> dict, int collationId) {
+    if (numBytes == 0) {
+      return this;

Review Comment:
   I don't see this early return in the regular `translate`. Won't this mean that
the regular `translate` returns a new string, while the collation-aware one returns
the same instance when called on an empty string? That is probably not what we want.
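To make the concern concrete, here is a minimal sketch of the identity difference. `TranslateIdentitySketch` and `Str` are hypothetical stand-ins, not Spark's `UTF8String`; the sketch assumes that the regular path always wraps its `StringBuilder` result in a fresh object, and it leaves out the actual collation-aware matching.

```java
import java.util.Map;

public class TranslateIdentitySketch {

  // Hypothetical stand-in for UTF8String; only the return-value identity matters here.
  static final class Str {
    final String value;

    Str(String value) {
      this.value = value;
    }

    // Same shape as the regular translate in the diff: build the result in a
    // StringBuilder and wrap it in a new object, even when the input is empty.
    // Simplification: unmapped characters are copied through unchanged.
    Str translate(Map<String, String> dict) {
      StringBuilder sb = new StringBuilder();
      int i = 0;
      while (i < value.length()) {
        int codePoint = value.codePointAt(i);
        String key = new String(Character.toChars(codePoint));
        sb.append(dict.getOrDefault(key, key));
        i += Character.charCount(codePoint);
      }
      return new Str(sb.toString());
    }

    // Same shape as the new method: early return of `this` for empty input.
    // The collation-aware lookup is omitted; non-empty input just reuses translate.
    Str collationAwareTranslate(Map<String, String> dict) {
      if (value.isEmpty()) {
        return this;
      }
      return translate(dict);
    }
  }

  public static void main(String[] args) {
    Str empty = new Str("");
    Map<String, String> dict = Map.of("a", "b");

    System.out.println(empty.translate(dict) == empty);               // false
    System.out.println(empty.collationAwareTranslate(dict) == empty); // true
  }
}
```

Either behaviour may be acceptable on its own; the point is only that the two paths should agree, so either both return `this` for an empty input or neither does.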



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

