On Tue, Nov 29, 2022 at 3:55 PM Jeff Davis <pg...@j-davis.com> wrote:
> =# select * from pg_icu_collation_versions('en_US') order by
> icu_version;
>  icu_version | uca_version | collator_version
> -------------+-------------+------------------
>  50.2        | 6.2         | 58.0.6.50
>  51.3        | 6.2         | 58.0.6.50
>  52.2        | 6.2         | 58.0.6.50
>  53.2        | 6.3         | 137.51
>  54.2        | 7.0         | 137.56
>  55.2        | 7.0         | 153.56
>  56.2        | 8.0         | 153.64
>  57.2        | 8.0         | 153.64
>  58.3        | 9.0         | 153.72
>  59.2        | 9.0         | 153.72
>  60.3        | 10.0        | 153.80
>  61.2        | 10.0        | 153.80
>  62.2        | 11.0        | 153.88
>  63.2        | 11.0        | 153.88
>  64.2        | 12.1        | 153.97
>  65.1        | 12.1        | 153.97
>  66.1        | 13.0        | 153.14
>  67.1        | 13.0        | 153.14
>  68.2        | 13.0        | 153.14
>  69.1        | 13.0        | 153.14
>  70.1        | 14.0        | 153.112
> (21 rows)
>
> This is good information, because it tells us that major library
> versions change more often than collation versions, empirically-
> speaking.

Wow, nice discovery about 104 -> 14. Yeah, I imagine we'll want some
kind of band-aid to tolerate that exact screwup and avoid spurious
warnings.

Bugs aside, that's quite a revealing table in other ways. We can see:

* The version scheme changed completely in ICU 53. This corresponds
to a major rewrite of the collation code, I see[1].

* The first component seems to be (UCOL_RUNTIME_VERSION << 4) + 9.
UCOL_RUNTIME_VERSION is in their uvernum.h, currently 9, was 8, bumped
between 54 and 55 (I see this in their commit log), corresponding to
the two possible numbers 137 and 153 that we see there. I don't know
where the final 9 term is coming from but it looks stable since the v2
collation rewrite landed.

* The second component seems to be uca_version_major * 8 +
uca_version_minor (that's the Unicode Collation Algorithm version, and
so far always matches the Unicode version, visible in the output of
the other function).

* The values you showed for English don't have a third component, but
if you try some other locales like 'zh' you'll see the CLDR major
version in third position. So I guess some locales depend on CLDR
data and others don't.

TL;DR it *looks* like the set of ingredients for the version string is:

* UCOL_RUNTIME_VERSION (rarely changes)
* UCA/Unicode major.minor version
* sometimes CLDR major version, not sure when
* 9

[1] https://icu.unicode.org/design/collation/v2
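
To make the guessed pattern easy to check against the table, here is a small Python sketch. It is only my reading of the numbers above, not anything taken from ICU's source: the function names are made up for illustration, and the UCOL_RUNTIME_VERSION values per ICU release (8 up to 54, 9 from 55) are assumed from the 137/153 split in the table.

```python
# Sketch of the *guessed* pattern for the first two components of the
# ICU collator version (ICU 53 and later). Names here are hypothetical.

def predicted_collator_prefix(ucol_runtime_version: int, uca_version: str) -> str:
    """Return 'first.second' as guessed from the table, e.g. '153.72'."""
    major, minor = (int(part) for part in uca_version.split("."))
    first = (ucol_runtime_version << 4) + 9   # 8 -> 137, 9 -> 153
    second = major * 8 + minor                # UCA 9.0 -> 72, 12.1 -> 97
    return f"{first}.{second}"

# (icu_version, assumed UCOL_RUNTIME_VERSION, uca_version, collator_version)
# A sample of rows copied from the en_US table above.
OBSERVED = [
    ("53.2", 8, "6.3", "137.51"),
    ("54.2", 8, "7.0", "137.56"),
    ("55.2", 9, "7.0", "153.56"),
    ("58.3", 9, "9.0", "153.72"),
    ("64.2", 9, "12.1", "153.97"),
    ("66.1", 9, "13.0", "153.14"),
    ("70.1", 9, "14.0", "153.112"),
]

def mismatches(rows):
    """List (icu_version, predicted, observed) where the guess disagrees."""
    result = []
    for icu_version, runtime_version, uca_version, observed in rows:
        predicted = predicted_collator_prefix(runtime_version, uca_version)
        if predicted != observed:
            result.append((icu_version, predicted, observed))
    return result

if __name__ == "__main__":
    for row in mismatches(OBSERVED):
        print(row)
```

On these sample rows the only disagreement is the UCA 13.0 case, where the guess gives 153.104 and the table shows 153.14, which lines up with the 104 -> 14 oddity mentioned at the top.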