dbatomic commented on code in PR #45064:
URL: https://github.com/apache/spark/pull/45064#discussion_r1488536286


##########
common/unsafe/src/main/java/org/apache/spark/unsafe/types/UTF8String.java:
##########
@@ -1410,6 +1422,13 @@ public boolean equals(final Object other) {
     }
   }
 
+  /**
+   * Collation-aware equality comparison of two UTF8String.
+   */
+  public boolean semanticEquals(final UTF8String other, int collationId) {
+    return CollationFactory.fetchCollation(collationId).equalsFunction.apply(this, other);

Review Comment:
   Would it be OK to do the benchmarks in a follow-up PR? We definitely need a 
benchmark strategy for both regular string operations and collation special cases.
   
   The closest we have so far in this space is `CharVarcharBenchmark`, which is 
not really what we need.
   
   IMO, this PR is getting a bit too large, so I would prefer to 
split it into smaller parts. What do you think?
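
To make the benchmark question above more concrete, here is a rough sketch of the kind of comparison a follow-up could measure: plain binary equality versus a collation-aware equality check. Everything in it is an assumption for illustration only. It uses `java.text.Collator` as a stand-in for whatever `CollationFactory.fetchCollation(...)` returns, operates on `String` rather than `UTF8String`, and the class and method names (`EqualsBenchSketch`, `countEqual`, `nanosPerOp`, `sampleData`) are hypothetical. A real benchmark would presumably build on Spark's existing benchmark utilities or JMH rather than hand-rolled timing.

```java
import java.text.Collator;
import java.util.Locale;
import java.util.function.BiPredicate;

// Hypothetical sketch, not part of the PR: compares binary vs. collation-aware equality.
final class EqualsBenchSketch {

  // Counts adjacent pairs considered equal; returning the count keeps the
  // comparisons from being optimized away.
  static long countEqual(String[] data, BiPredicate<String, String> eq) {
    long hits = 0;
    for (int i = 0; i + 1 < data.length; i++) {
      if (eq.test(data[i], data[i + 1])) {
        hits++;
      }
    }
    return hits;
  }

  // Average wall-clock nanoseconds per comparison over `iters` passes.
  // Crude timing only: no warm-up control, no statistical treatment.
  static double nanosPerOp(String[] data, BiPredicate<String, String> eq, int iters) {
    long sink = 0;
    long start = System.nanoTime();
    for (int it = 0; it < iters; it++) {
      sink += countEqual(data, eq);
    }
    long elapsed = System.nanoTime() - start;
    if (sink < 0) {
      throw new IllegalStateException("unreachable; keeps sink alive");
    }
    return (double) elapsed / ((long) iters * (data.length - 1));
  }

  // Alternates lower-case and upper-case spellings: value0, VALUE0, value1, VALUE1, ...
  static String[] sampleData(int n) {
    String[] data = new String[n];
    for (int i = 0; i < n; i++) {
      String s = "value" + (i / 2);
      data[i] = (i % 2 == 0) ? s : s.toUpperCase(Locale.ROOT);
    }
    return data;
  }

  public static void main(String[] args) {
    String[] data = sampleData(100_000);

    // Stand-in for a case-insensitive collation (assumption, not Spark's API).
    Collator caseInsensitive = Collator.getInstance(Locale.ROOT);
    caseInsensitive.setStrength(Collator.SECONDARY);

    BiPredicate<String, String> binary = String::equals;
    BiPredicate<String, String> collated = (a, b) -> caseInsensitive.compare(a, b) == 0;

    // One untimed pass each as a minimal warm-up.
    countEqual(data, binary);
    countEqual(data, collated);

    System.out.printf("binary:   %.1f ns/op%n", nanosPerOp(data, binary, 20));
    System.out.printf("collated: %.1f ns/op%n", nanosPerOp(data, collated, 20));
  }
}
```

The input deliberately mixes case so the two predicates disagree on half of the adjacent pairs, which exercises both the fast mismatch path and the full collation comparison.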



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


