dbatomic commented on code in PR #45791:
URL: https://github.com/apache/spark/pull/45791#discussion_r1555413439
##########
common/unsafe/src/main/java/org/apache/spark/sql/catalyst/util/CollationFactory.java:
##########
@@ -176,15 +176,31 @@ public Collation(
*/
public static StringSearch getStringSearch(
- final UTF8String left,
- final UTF8String right,
+ final UTF8String targetUTF8String,
+ final UTF8String patternUTF8String,
final int collationId) {
- String pattern = right.toString();
- CharacterIterator target = new StringCharacterIterator(left.toString());
+
+ if (collationId == UTF8_BINARY_COLLATION_ID) {
+ return getStringSearch(targetUTF8String, patternUTF8String);
Review Comment:
Why would we ever build a StringSearch for UTF8_BINARY? For UTF8_BINARY we
should stay at the byte level and avoid copying the UTF8String into a String.
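
To illustrate what I mean by staying at the byte level, here is a minimal, self-contained sketch. It is not the actual Spark code: the class `BinaryContainsSketch` and the helper `indexOf(byte[], byte[])` are hypothetical names, and plain `byte[]` stands in for `UTF8String` so the example runs on its own. The point is only that a binary collation can compare raw UTF-8 bytes directly, without materializing a `String` or an ICU `StringSearch`.

```java
import java.nio.charset.StandardCharsets;

public class BinaryContainsSketch {
  // Hypothetical helper: naive byte-wise search over UTF-8 encoded data.
  // Returns the byte offset of the first match, or -1 if there is none.
  // No String is created from the inputs.
  static int indexOf(byte[] target, byte[] pattern) {
    if (pattern.length == 0) {
      return 0; // empty pattern matches at the start
    }
    for (int i = 0; i + pattern.length <= target.length; i++) {
      int j = 0;
      while (j < pattern.length && target[i + j] == pattern[j]) {
        j++;
      }
      if (j == pattern.length) {
        return i;
      }
    }
    return -1;
  }

  public static void main(String[] args) {
    // The Strings here only produce sample bytes for the demo; in the real
    // code path the bytes would already be available from the UTF8String.
    byte[] target = "spark sql".getBytes(StandardCharsets.UTF_8);
    byte[] pattern = "sql".getBytes(StandardCharsets.UTF_8);
    System.out.println(indexOf(target, pattern)); // prints 6
  }
}
```

With something like this available for the binary case, the UTF8_BINARY branch would not need to go through getStringSearch at all; callers could dispatch on collationId before reaching this method.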
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]