On Tue, Jun 7, 2022 at 03:43:32PM -0400, Tom Lane wrote:
> Thomas Munro <thomas.mu...@gmail.com> writes:
> > On Wed, Jun 8, 2022 at 3:58 AM Rod Taylor <r...@rbt.ca> wrote:
> >> Is this more involved than creating a list of all valid Unicode characters
> >> (~144 thousand), sorting them, then running crc32 over the sorted order to
> >> create the "version" for the library/collation pair? Far from free but few
> >> databases use more than a couple different collations.
>
> > Collation rules have multiple levels and all kinds of quirks, so that
> > won't work.
>
> Yeah, and it's exactly at the level of quirks that things are likely
> to change.  Nobody's going to suddenly start sorting B before A.
> They might, say, change their minds about where the digram "cz"
> sorts relative to single letters, in languages where special rules
> for that are a thing.
>
> The idea of fingerprinting a collation's behavior is interesting,
> but I've got doubts about whether we can make a sufficiently thorough
> fingerprint.

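
For illustration only, here is a minimal Python sketch of the fingerprint
proposal quoted above and of the objection to it. Everything in it is an
assumption made for the example: fingerprint(), plain_key() and digram_key()
are hypothetical names, printable ASCII stands in for the full list of valid
Unicode characters, and the "cz" rule is invented and describes no real
locale or library.

```python
import zlib


def fingerprint(sort_key, chars):
    """CRC32 of the single characters in the order sort_key puts them."""
    ordered = sorted(chars, key=sort_key)
    return zlib.crc32("".join(ordered).encode("utf-8"))


def plain_key(s):
    # Toy collation A: plain code point order.
    return s


def digram_key(s):
    # Toy collation B: identical to A, except that the digram "cz" is
    # (arbitrarily, for this example) sorted ahead of every other "c..."
    # sequence.
    return s.replace("cz", "c\x00")


# Stand-in for "all valid Unicode characters": printable ASCII only.
chars = [chr(cp) for cp in range(0x20, 0x7F)]

fp_plain = fingerprint(plain_key, chars)
fp_digram = fingerprint(digram_key, chars)

words = ["cyrk", "czas"]
plain_order = sorted(words, key=plain_key)
digram_order = sorted(words, key=digram_key)

print(fp_plain == fp_digram)      # True: single characters sort the same
print(plain_order, digram_order)  # ['cyrk', 'czas'] ['czas', 'cyrk']
```

The two toy collations produce the same fingerprint because no single
character triggers the digram rule, yet they order real strings differently.
A fingerprint built only from single characters would therefore miss exactly
the kind of change described above.
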
Rather than trying to figure out if the collations changed, have we ever
considered checking if index additions and lookups don't match the OS
collation and reporting these errors somehow?

-- 
  Bruce Momjian  <br...@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Indecision is a decision.  Inaction is an action.  Mark Batterson