Thomas Munro <thomas.mu...@gmail.com> writes:
> On Wed, Jun 8, 2022 at 3:58 AM Rod Taylor <r...@rbt.ca> wrote:
>> Is this more involved than creating a list of all valid Unicode characters
>> (~144 thousand), sorting them, then running crc32 over the sorted order to
>> create the "version" for the library/collation pair? Far from free but few
>> databases use more than a couple different collations.
> Collation rules have multiple levels and all kinds of quirks, so that
> won't work.

Yeah, and it's exactly at the level of quirks that things are likely to
change.  Nobody's going to suddenly start sorting B before A.  They might,
say, change their minds about where the digram "cz" sorts relative to
single letters, in languages where special rules for that are a thing.

The idea of fingerprinting a collation's behavior is interesting, but
I've got doubts about whether we can make a sufficiently thorough
fingerprint.

			regards, tom lane
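
To make the point about quirks concrete, here is a minimal Python sketch. It is not taken from PostgreSQL, ICU, or glibc: the two "collations" are toy key functions invented for illustration, and the names `key_simple`, `key_digraph`, and `single_char_fingerprint` are hypothetical. It uses the digraph "ch" (which sorts after "h" in Czech) as a stand-in for the kind of multi-character rule mentioned above. Both toy collations order every single character identically, so a CRC32 over the sorted single-character list cannot tell them apart, yet they sort real words differently.

```python
import zlib

ALPHABET = "abcdefghijklmnopqrstuvwxyz"

def key_simple(word):
    # Toy collation A: compare letter by letter in plain alphabet order.
    return [ALPHABET.index(ch) * 2 for ch in word]

def key_digraph(word):
    # Toy collation B: identical to A, except that the digraph "ch" is a
    # single collating element that sorts just after "h".
    key = []
    i = 0
    while i < len(word):
        if word.startswith("ch", i):
            key.append(ALPHABET.index("h") * 2 + 1)
            i += 2
        else:
            key.append(ALPHABET.index(word[i]) * 2)
            i += 1
    return key

def single_char_fingerprint(key_func):
    # The proposed scheme: sort individual characters, then CRC the result.
    ordered = sorted(ALPHABET, key=key_func)
    return zlib.crc32("".join(ordered).encode("utf-8"))

WORDS = ["hrad", "chata", "cihla"]

if __name__ == "__main__":
    # Same fingerprint for both collations...
    print(single_char_fingerprint(key_simple) == single_char_fingerprint(key_digraph))
    # ...but different word order.
    print(sorted(WORDS, key=key_simple))
    print(sorted(WORDS, key=key_digraph))
```

A fingerprint would have to cover multi-character sequences (and the other comparison levels) to catch this, which is where the doubt about making it "sufficiently thorough" comes from.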