uros-db commented on code in PR #46206:
URL: https://github.com/apache/spark/pull/46206#discussion_r1579023601
##########
common/unsafe/src/main/java/org/apache/spark/sql/catalyst/util/CollationSupport.java:
##########
@@ -214,6 +214,207 @@ public static int execICU(final UTF8String string, final
UTF8String substring,
}
}
+ public static class StringTrim {
+ public static UTF8String exec(
+ final UTF8String srcString,
+ final int collationId) {
+ CollationFactory.Collation collation =
CollationFactory.fetchCollation(collationId);
+ if (collation.supportsBinaryEquality) {
+ return execBinary(srcString);
+ } else {
+ return execICU(srcString, collationId);
+ }
+ }
+ public static UTF8String exec(
+ final UTF8String srcString,
+ final UTF8String trimString,
+ final int collationId) {
+ CollationFactory.Collation collation =
CollationFactory.fetchCollation(collationId);
+ if (collation.supportsBinaryEquality) {
+ return execBinary(srcString, trimString);
+ } else {
+ return execICU(srcString, trimString, collationId);
+ }
Review Comment:
That was indeed the case, but it would now be best to separate these
concerns:
- static classes for expressions in CollationSupport should handle
"routing" between the binary/lowercase/ICU implementations
- CollationAwareUTF8String should "extend" UTF8String with the appropriate
collation awareness, but shouldn't branch based on collation

For example, `CollationAwareUTF8String.trimLeft` should no longer call
`CollationFactory.fetchCollation(collationId).supportsLowercaseEquality`;
instead, the execLowercase implementation in the appropriate `CollationSupport`
class should call trimString.toLowerCase() before calling it.
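
A rough sketch of the split described above, using plain `java.lang.String` stand-ins so it runs on its own. The class names `StringTrimLeftSketch` and `CollationAwareSketch`, the `Kind` enum, and the `lowercaseTrimLeft` helper are hypothetical and only illustrate the shape; they are not the actual Spark classes or signatures, and the ICU path is left out.

```java
import java.util.Locale;

// Hypothetical stand-in for CollationAwareUTF8String: collation-aware helpers
// that never look up the collation themselves.
final class CollationAwareSketch {
    // Assumes the caller has already lowercased trimString.
    static String lowercaseTrimLeft(String src, String lowercasedTrim) {
        int i = 0;
        while (i < src.length()) {
            int cp = src.codePointAt(i);
            String lower = new String(Character.toChars(cp)).toLowerCase(Locale.ROOT);
            if (!lowercasedTrim.contains(lower)) {
                break; // first character that is not in the trim set
            }
            i += Character.charCount(cp);
        }
        return src.substring(i);
    }
}

// Hypothetical stand-in for a static expression class in CollationSupport:
// the only place that branches on the kind of collation.
final class StringTrimLeftSketch {
    enum Kind { BINARY, LOWERCASE } // ICU omitted in this sketch

    static String exec(String src, String trim, Kind kind) {
        switch (kind) {
            case LOWERCASE:
                return execLowercase(src, trim);
            default:
                return execBinary(src, trim);
        }
    }

    // Binary path: exact code point comparison, no collation awareness needed.
    static String execBinary(String src, String trim) {
        int i = 0;
        while (i < src.length()) {
            int cp = src.codePointAt(i);
            if (trim.indexOf(cp) < 0) {
                break;
            }
            i += Character.charCount(cp);
        }
        return src.substring(i);
    }

    // Lowercase path: normalizes trimString here, then delegates to the helper.
    static String execLowercase(String src, String trim) {
        return CollationAwareSketch.lowercaseTrimLeft(src, trim.toLowerCase(Locale.ROOT));
    }
}
```

With this shape, the helper stays free of any `fetchCollation` call, and the choice of implementation is made once in `exec`.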
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]