On 15.03.22 08:41, Julien Rouhaud wrote:
The locale object in ICU is an identifier that specifies a particular locale
and has fields for language, country, and an optional code to specify further
variants or subdivisions. These fields also can be represented as a string
with the fields separated by an underscore.
I think the Localization chapter needs to be reorganized a bit, but I'll
leave that for a separate patch.
WFM.
I ended up writing a bit of content for that chapter.
While on that topic, the doc should probably mention that default ICU
collations can only be deterministic.
Well, there is no option to do otherwise, so I'm not sure where/how to
mention that. We usually don't document options that don't exist. ;-)
Sure, but I'm afraid that users may still be tempted to use ICU locales like
und-u-ks-level2 from the case_insensitive example in the doc and hope that it
will work accordingly. Or maybe it's just me that still sees ICU as dark magic
and want to be extra cautious.
I have added this to the CREATE DATABASE ref page.
Unrelated, but I just realized that we have PGC_INTERNAL gucs for lc_ctype and
lc_collate. Should we add one for icu_locale too?
I'm not sure. I think the existing ones are more for backward
compatibility with the time before it was settable per database.
in AlterCollation(), pg_collation_actual_version(), AlterDatabaseRefreshColl()
and pg_database_collation_actual_version():
- datum = SysCacheGetAttr(COLLOID, tup, Anum_pg_collation_collcollate,
&isnull);
- Assert(!isnull);
- newversion = get_collation_actual_version(collForm->collprovider,
TextDatumGetCString(datum));
+ datum = SysCacheGetAttr(COLLOID, tup, collForm->collprovider ==
COLLPROVIDER_ICU ? Anum_pg_collation_colliculocale : Anum_pg_collation_collcollate,
&isnull);
+ if (!isnull)
+ newversion = get_collation_actual_version(collForm->collprovider,
TextDatumGetCString(datum));
+ else
+ newversion = NULL;
The columns are now nullable, but can you actually end up with a null locale
name in the expected field without manual DML on the catalog, corruption or
similar? I think it should be a plain error explaining the inconsistency
rather then silently assuming there's no version. Note that at least
pg_newlocale_from_collation() asserts that the specific libc/icu field it's
interested in isn't null.
This is required because the default collations's fields are null now.
Yes I saw that, but that's a specific exception. Detecting whether it's the
DEFAULT_COLLATION_OID or not and raise an error when a null value isn't
expected seems like it could be worthwhile.
I have fixed that as you suggest.
So apart from the few details mentioned above I'm happy with this patch!
committed!