nikolamand-db commented on code in PR #46180:
URL: https://github.com/apache/spark/pull/46180#discussion_r1593910153


##########
common/unsafe/src/main/java/org/apache/spark/sql/catalyst/util/CollationFactory.java:
##########
@@ -117,76 +119,445 @@ public Collation(
     }

   /**
-   * Constructor with comparators that are inherited from the given collator.
+   * collation id (32-bit integer) layout:
+   * bit 31:    0 = predefined collation, 1 = user-defined collation
+   * bit 30-29: 00 = utf8-binary, 01 = ICU, 10 = indeterminate (without spec implementation)

Review Comment:
   I think it would be better to stick with this naming, for the following reason. If we used bit 31 for the indeterminate collation, we would shrink the user-defined collation space, because we would need an additional bit to distinguish between predefined and user-defined collations. It is more convenient to make that distinction at the first bit, since the indeterminate collation falls into the predefined space.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
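
To make the bit layout quoted in the review comment concrete, here is a minimal Java sketch that decodes only the three bits named in the Javadoc (bit 31 and bits 30-29). The class `CollationIdLayoutSketch`, the `Provider` enum, and both helper methods are hypothetical names chosen for illustration; they are not taken from `CollationFactory.java`. The size comparison in `main` additionally assumes that every bit below the flag bit(s) is available for user-defined IDs, which the quoted Javadoc does not state.

```java
// Hypothetical sketch of the collation id layout described in the Javadoc above.
// None of these names come from CollationFactory.java.
final class CollationIdLayoutSketch {

  // Values of bits 30-29 as listed in the Javadoc; 11 is not listed there.
  enum Provider { UTF8_BINARY, ICU, INDETERMINATE, UNASSIGNED }

  // Bit 31: 0 = predefined collation, 1 = user-defined collation.
  static boolean isUserDefined(int collationId) {
    return (collationId >>> 31) == 1;
  }

  // Bits 30-29: 00 = utf8-binary, 01 = ICU, 10 = indeterminate.
  static Provider provider(int collationId) {
    int bits = (collationId >>> 29) & 0b11;
    switch (bits) {
      case 0b00: return Provider.UTF8_BINARY;
      case 0b01: return Provider.ICU;
      case 0b10: return Provider.INDETERMINATE;
      default:   return Provider.UNASSIGNED;
    }
  }

  public static void main(String[] args) {
    int indeterminateId = 0b10 << 29; // bit 31 is 0, so it stays in the predefined space
    System.out.println(isUserDefined(indeterminateId));
    System.out.println(provider(indeterminateId));

    // Assumption: all bits below the flag bit(s) are usable for user-defined ids.
    long userIdsWithOneFlagBit = 1L << 31;  // only bit 31 marks "user-defined"
    long userIdsWithTwoFlagBits = 1L << 30; // bit 31 taken by "indeterminate", one more bit needed
    System.out.println(userIdsWithOneFlagBit + " vs " + userIdsWithTwoFlagBits);
  }
}
```

Under that assumption, keeping the indeterminate collation inside the predefined space leaves twice as many user-defined IDs as reserving bit 31 for it would, which is the trade-off the comment argues for.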