Robert Haas wrote:

> To be honest, I'd probably be ready to support making the default
> encoding UTF8 regardless of the environment, and you have to use -E
> if you want anything else. I think there are still people using
> other encodings, but I believe it to be a small minority at this
> point.

It would be interesting to have the point of view of Asian users
about this. Recently, the suggestion to retire GB18030 in favor of
UTF-8 was met with the objection that GB18030 was likely preferred
by users from China [1]. Another notable example against UTF-8 is
Tatsuo Ishii mentioning that Japanese users tend to use --no-locale
rather than UTF-8 locales [2].

Also, it's not obvious how initdb could choose a UTF-8 locale
regardless of the environment. For instance, let's say it finds
LC_ALL="fr_FR.iso885915@euro", what would it do? Maybe look at the
UTF-8 locales on the system. Here's a subset of what it would find
on my system:

  C.utf8
  en_AG
  en_AG.utf8
  en_AU.utf8
  en_BW.utf8
  en_CA.utf8
  en_DK.utf8
  en_GB.utf8
  en_HK.utf8
  en_IE.utf8
  ...
  tr_TR.utf8

From that kind of list, which locale should it pick and why?

Personally I think that ignoring the environment's LC_* for the
collations would be fine if we went for builtin/C.UTF-8 by default,
as $subject suggests. But the level of enthusiasm for that from the
community seems much lower than it would need to be for that kind of
change to be acceptable.

[1] https://www.postgresql.org/message-id/45b4b689-0e78-4d30-a5f9-1a39d01ab2b7%40ww-it.cn
[2] https://www.postgresql.org/message-id/20230608.104535.2171011311090815110.t-ishii%40sranhm.sra.co.jp

Best regards,
-- 
Daniel Vérité
https://postgresql.verite.pro/
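
To make the lookup problem above concrete, here is a rough sketch in Python of what "look at the UTF-8 locales on the system" could amount to. It is not how initdb works. The function names and the fallback rule are assumptions made up for illustration, and the list of available locales is passed in by the caller rather than read from the system.

```python
import re

# Hypothetical helper: split a POSIX-style locale name of the form
# language[_territory][.codeset][@modifier] into its parts.
_LOCALE_RE = re.compile(
    r"^(?P<lang>[A-Za-z]+)"
    r"(?:_(?P<territory>[A-Za-z0-9]+))?"
    r"(?:\.(?P<codeset>[^@]+))?"
    r"(?:@(?P<modifier>.+))?$"
)


def is_utf8(codeset):
    """True if the codeset spells UTF-8 in any common form (utf8, UTF-8)."""
    return codeset is not None and codeset.lower().replace("-", "") == "utf8"


def utf8_candidates(env_locale, available):
    """Return the UTF-8 locales that could stand in for env_locale.

    Assumed rule, for illustration only: prefer UTF-8 locales with the same
    language and territory; if there are none, return every UTF-8 locale,
    which leaves the choice open.
    """
    m = _LOCALE_RE.match(env_locale)
    if m is None:
        return []
    wanted = (m["lang"], m["territory"])

    utf8 = []
    for name in available:
        am = _LOCALE_RE.match(name)
        if am is not None and is_utf8(am["codeset"]):
            utf8.append((name, (am["lang"], am["territory"])))

    exact = [name for name, key in utf8 if key == wanted]
    return exact if exact else [name for name, _ in utf8]


if __name__ == "__main__":
    # A few of the names from the list above; no fr_FR UTF-8 locale among them.
    available = ["C.utf8", "en_AG", "en_AG.utf8", "en_AU.utf8", "tr_TR.utf8"]
    print(utf8_candidates("fr_FR.iso885915@euro", available))
```

When no locale with the same language and territory is installed, the sketch can only hand back every UTF-8 locale it found, and nothing in the environment says which of them to prefer.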
