On 09/21/2017 01:40 AM, Peter Geoghegan wrote:
On Wed, Sep 20, 2017 at 4:08 PM, Peter Geoghegan <p...@bowt.ie> wrote:
pg_import_system_collations() takes care to use the non-BCP-47 style for
such versions, so I think this is working correctly.

But CREATE COLLATION doesn't use pg_import_system_collations().

And perhaps more to the point: it is highly confusing that we use one or
the other of those 2 things ("langtag"/BCP 47 tag or "name"/legacy
locale name) as "colcollate", depending on ICU version, thereby
*behaving* as if ICU < 54 really didn't know anything about BCP 47
tags. Because, obviously, earlier ICU versions know plenty about BCP
47, since 9 lines further down we use "langtag"/BCP 47 tag as collname
for CollationCreate() -- regardless of ICU version.

How can you say "ICU <54 doesn't even support the BCP 47 style", given
all that? Those versions will still have locales named "*-x-icu" when
users do "\dOS". Users will be highly confused when they quite
reasonably try to generalize from the example in the docs and what
"\dOS" shows, and get results that are wrong, often only in a very
subtle way.

If we are fine with supporting only ICU 4.2 and later (which I think we
are, given that ICU 4.2 was released in 2009), then using
uloc_forLanguageTag()[1] to validate and canonicalize seems like the
right solution. I had missed that this function even existed when I last
read the documentation. Does it return a BCP 47 tag in modern versions
of ICU?

I would strongly prefer that there be, as far as possible, only one
format for inputting ICU locales.

1. http://www.icu-project.org/apiref/icu4c/uloc_8h.html#aa45d6457f72867880f079e27a63c6fcb

Andreas