On Fri, Sep 22, 2017 at 8:56 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> The big concern I have here is that this feels a lot like something that
> we'll regret at leisure, if it's not right in the first release. I'd
> much rather be restrictive in v10 and then loosen the rules later, than
> be lax in v10 and then have to argue about whether to break backwards
> compatibility in order to gain saner behavior.

To the best of my knowledge, the only restriction implied by limiting ourselves to the BCP 47 format (as part of standardizing what is stored in pg_collation) is that users might know about the traditional locale strings from some other place, and be surprised when their knowledge doesn't transfer to Postgres. Personally, I don't think that that's a big deal. If it actually is important, then I'm surprised that it took this long for a doc change mentioning it to be proposed (though the docs *do* say "Collations provided by ICU are created with names in BCP 47 language tag format").

>> We have never canonicalized collations before and therefore it is not
>> essential that we do that now.
>
> Actually, we try; see initdb.c's check_locale_name(). It's not our
> fault that setlocale(3) fails to play along on many platforms.

But it will be our fault if we ship a v10 that does the kind of unsettled canonicalization you see within pg_import_system_collations() (the "collcollate = U_ICU_VERSION_MAJOR_NUM >= 54 ? langtag : name" thing). That looks very much like the tail wagging the dog to me.

-- 
Peter Geoghegan