vladanvasi-db commented on code in PR #48737:
URL: https://github.com/apache/spark/pull/48737#discussion_r1829403446
##########
common/unsafe/src/main/java/org/apache/spark/sql/catalyst/util/CollationFactory.java:
##########
@@ -547,23 +587,41 @@ private static CollationSpecUTF8 fromCollationId(int collationId) {
// Extract case sensitivity from collation ID.
int caseConversionOrdinal = SpecifierUtils.getSpecValue(collationId,
CASE_SENSITIVITY_OFFSET, CASE_SENSITIVITY_MASK);
+ // Extract utf8 binary collation type from collation ID.
+ int utf8BinaryCollationType = SpecifierUtils.getSpecValue(collationId,
+ UTF8BINARY_COLLATION_TYPE_OFFSET, UTF8BINARY_COLLATION_TYPE_MASK);
// Extract space trimming from collation ID.
int spaceTrimmingOrdinal = getSpaceTrimming(collationId).ordinal();
assert(isValidCollationId(collationId));
return new CollationSpecUTF8(
CaseSensitivity.values()[caseConversionOrdinal],
+ Utf8BinaryCollationType.values()[utf8BinaryCollationType],
SpaceTrimming.values()[spaceTrimmingOrdinal]);
}
private static boolean isValidCollationId(int collationId) {
- collationId = SpecifierUtils.removeSpec(
- collationId,
- SPACE_TRIMMING_OFFSET,
- SPACE_TRIMMING_MASK);
- collationId = SpecifierUtils.removeSpec(
- collationId,
- CASE_SENSITIVITY_OFFSET,
- CASE_SENSITIVITY_MASK);
+ if (SpecifierUtils.getSpecValue(collationId, UTF8BINARY_COLLATION_TYPE_OFFSET,
+ UTF8BINARY_COLLATION_TYPE_MASK) != 0) {
Review Comment:
This 17th bit is only relevant for the implicit `utf8_binary` collation.
Because it cannot be combined with trim modifiers, any collation ID that
sets both this bit and a trim modifier is not valid.
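
To make the rule above concrete, here is a minimal sketch of the check being
described. It is not the code in this PR: the class name, the helper names,
and all offsets and masks are assumptions chosen for illustration (offset 16
is assumed from "17th bit", counting from 1; the space-trimming offset is a
placeholder). It covers only the flag-plus-trim conflict and says nothing
about other combinations.

```java
// Hypothetical sketch; constants are illustrative assumptions, not the
// values defined in CollationFactory.
final class CollationIdSketch {
    // Assumed: "17th bit" counted from 1, i.e. zero-based offset 16.
    static final int UTF8BINARY_COLLATION_TYPE_OFFSET = 16;
    static final int UTF8BINARY_COLLATION_TYPE_MASK = 0b1;
    // Assumed placeholder position for the space-trimming specifier.
    static final int SPACE_TRIMMING_OFFSET = 18;
    static final int SPACE_TRIMMING_MASK = 0b1;

    // Reads the specifier stored at the given offset, limited by the mask.
    static int getSpecValue(int collationId, int offset, int mask) {
        return (collationId >> offset) & mask;
    }

    // True when the ID sets both the implicit utf8_binary bit and a trim
    // modifier, the combination the comment describes as invalid.
    static boolean violatesImplicitTrimRule(int collationId) {
        boolean implicitFlag = getSpecValue(collationId,
            UTF8BINARY_COLLATION_TYPE_OFFSET, UTF8BINARY_COLLATION_TYPE_MASK) != 0;
        boolean trimmed = getSpecValue(collationId,
            SPACE_TRIMMING_OFFSET, SPACE_TRIMMING_MASK) != 0;
        return implicitFlag && trimmed;
    }
}
```

Under these assumed constants, an ID with only bit 16 set passes this check,
while an ID with both bit 16 and bit 18 set is rejected.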
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]