nikolamand-db commented on code in PR #46180:
URL: https://github.com/apache/spark/pull/46180#discussion_r1606717497
##########
common/unsafe/src/main/java/org/apache/spark/sql/catalyst/util/CollationFactory.java:
##########
@@ -117,76 +119,445 @@ public Collation(
}
/**
- * Constructor with comparators that are inherited from the given collator.
+ * collation id (32-bit integer) layout:
+ * bit 31: 0 = predefined collation, 1 = user-defined collation
+ * bit 30-29: 00 = utf8-binary, 01 = ICU, 10 = indeterminate (without spec
implementation)
+ * bit 28: 0 for utf8-binary / 0 = case-sensitive, 1 = case-insensitive
for ICU
+ * bit 27: 0 for utf8-binary / 0 = accent-sensitive, 1 =
accent-insensitive for ICU
+ * bit 26-25: zeroes, reserved for punctuation sensitivity
+ * bit 24-23: zeroes, reserved for first letter preference
+ * bit 22-21: 00 = unspecified, 01 = to-lower, 10 = to-upper
Review Comment:
Outdated, removed support for lower/upper conversions with sole exception of
`UTF8_BINARY_LCASE`.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]