On Tue, Sep 19, 2017 at 7:01 PM, Peter Geoghegan <p...@bowt.ie> wrote:
> I didn't post the patch that generates the errors in my opening e-mail
> because I'm not confident it's correct just yet. And, I think that I
> see a bigger problem: we pass a string that is almost certainly a BCP
> 47 string to ucol_open() from within pg_newlocale_from_collation(). We
> do so despite the fact that ucol_open() apparently doesn't accept BCP
> 47 syntax locale strings until ICU 54 [1]. Seems entirely possible
> that this accounts for the problems you saw on ICU 4.2, back when we
> were still creating keyword variants (I guess that the keyword
> variants seem very "BCP 47-ish" to me).

ISTM that the proper fix here is to use uloc_forLanguageTag() [1] (not to be confused with uloc_toLanguageTag()) to get a valid locale identifier on versions of ICU where BCP 47 format tags are not directly accepted as locale identifiers (versions prior to ICU 54). This would happen as an extra step within pg_newlocale_from_collation(), since BCP 47 format would be what is stored in pg_collation.

Since uloc_forLanguageTag() became stable in ICU 4.2, the earliest version that we support, I believe that would leave us in good shape.

[1] https://ssl.icu-project.org/apiref/icu4c/uloc_8h.html#aa45d6457f72867880f079e27a63c6fcb
--
Peter Geoghegan