> On Oct 11, 2025, at 10:06, Jeff Davis <[email protected]> wrote:
> 
> On Sat, 2025-10-11 at 08:30 +0800, Chao Li wrote:
>> * If we make that fail, I don’t think that would break existing
>> scripts. Because the default provider is libc and you are introducing
>> a new environment variable to set locale provider, thus a plain
>> initdb will not use builtin provider. Maybe provider can come from
>> PG_TEST_INITDB_EXTRA_OPTS, I'm ok for test environment to only
>> issue warnings.
> 
> I would like it to be possible to change the initdb default in the
> future to "builtin". See:
> 
> https://www.postgresql.org/message-id/[email protected]
> 
> in that case, initdb should be able to succeed without other options.

Yes, if we decide to go down that path, then what I said would no longer be valid.

> 
>> * I am thinking out loud. Builtin provider is more performant but with
>> certain limitations. Some production users may want to try builtin
>> provider for better performance but not being aware of the
>> limitation. Their environment contains the actual LC_CTYPE/LC_COLLATE
>> they want to use, and they set the new environment variable with
>> “builtin” for provider. In this case, failing “initdb” would make the
>> user clearly realize the limitation of builtin provider. Otherwise,
>> if the user also ignores the warning messages, then the database
>> would be created with unexpected ctype, which would lead to loss
>> (time, data, etc.)
> 
> What limitation and/or loss are you concerned about?
> 

By the limitation of the builtin provider, I just meant that it supports fewer LC_CTYPE/LC_COLLATE settings than the other two providers. I wasn’t concerned about anything specific; I was just imagining whether anything could be negatively affected.

> Unless I'm mistaken, LC_CTYPE has very little practical effect when the
> provider is builtin and the encoding is UTF-8.
> 
> The main effect that I'm aware of is that system errors from the OS
> rely on LC_CTYPE for translation. Ordinary Postgres messages don't need
> LC_CTYPE, so most of NLS still works even with LC_CTYPE=C; it's just
> strerror() that depends on LC_CTYPE for the encoding.
> 
> LC_CTYPE also affects full text search parsing, but I'm fixing that as
> part of another patch to use the database locale instead.
> 
> I think contrib/fuzzystrmatch may be affected.
> 
> Callers of pg_strcasecmp() could be affected, but it's mostly used to
> compare with ascii anyway.
> 
> If you are aware of other areas, please let me know.
> 

Thanks for the explanation. I think I am good now. The latest v3 patch looks good to me.

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/
