[First sent on 2021-05-03. Resending because it has not been fully handled.]

https://posix.rhansen.org/p/gettext_draft says (line 343..345)

  "For each locale name in LANGUAGE, or if LANGUAGE is not set or is
  empty, or no suitable messages object is found in processing LANGUAGE,
  the pathname used to locate the messages object shall be
  dirname/localename/categoryname/textdomainname.mo, where:
  ...
  For the LANGUAGE search, the localename part is each locale name from
  LANGUAGE in turn ....
  For the single-locale search, the localename part is the name of the
  current locale, or the locale specified in an *_l() function call, for
  the category named by categoryname."

This text is ambiguous. The first cited paragraph says that it looks in a single directory; the second cited paragraph says that it tries locale names "in turn". This is contradictory.

Also, when it says "in turn", it does not say what the stopping condition is: does it loop
  - until an existing locale name is found?
  - until a file dirname/localename/categoryname/textdomainname.mo is found?
  - until a file dirname/localename/categoryname/textdomainname.mo is found that contains a translation for the given msgid?

Under most interpretations of this set of paragraphs, this is NOT how GNU gettext behaves. If POSIX standardizes it like this, GNU libc and GNU gettext will have the choice between
  (a) looking in different (and fewer) directories than they do today, causing major i18n dysfunctionality to users, until the users have set up lots of symbolic links between directories, or
  (b) violating POSIX on this point.
I will vote for (b).

Namely, what GNU gettext does is to look in SEVERAL (not ONE) directories per LANGUAGE element. This is true for *both* the LANGUAGE search and the single-locale search. The localename parts of these directories are constructed from the language identifier (element of LANGUAGE) or locale name.

For example:
  * The language identifier 'de' gives rise to the localename part
      de
  * The language identifier 'de_AT' gives rise to the localename parts
      de_AT
      de
  * The locale name 'de_AT.UTF-8' gives rise to the localename parts
      de_AT.UTF-8
      de_AT.utf8
      de_AT
      de.UTF-8
      de.utf8
      de
  * The locale name 'uz_UZ.UTF-8@cyrillic' gives rise to the localename parts
      uz_UZ.UTF-8@cyrillic
      uz_UZ.utf8@cyrillic
      uz_UZ@cyrillic
      uz.UTF-8@cyrillic
      uz.utf8@cyrillic
      uz@cyrillic
      uz_UZ.UTF-8
      uz_UZ.utf8
      uz_UZ
      uz.UTF-8
      uz.utf8
      uz

This list of directories is important for people who live in communities which often (but not always) have translations of their own but can read translations for other locales. In the examples above:
  * A user in Austria prefers translations for Austrian German, but can also read German with no problem.
  * A user in Uzbekistan may prefer translations in Cyrillic but can also read translations in Latin. [1]

If the above text were adopted, it would have the following consequences:

1) Many symbolic links are needed in /usr/share/locale/. Solaris 11.4 is a system that implements gettext() as described in the above text, and it has the links shown below [2].

2) Users who want to create a new locale (e.g. for English in Australia) will have to create a symlink
     /usr/share/locale/en_AU -> /usr/share/locale/en
   and so on for each custom locale.

3) Users who install packages in non-privileged directories (for GNU programs, that's the --prefix=PREFIX option) will have to create the same number of symbolic links in their PREFIX/share/locale/ directory.

4) Users will have to set fallback logic in their LANGUAGE environment variable
     LANGUAGE=de_AT:de_DE
   instead of having it built-in:
     LANGUAGE=de_AT

This is BAD, BAD, BAD.

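To make the expansion in the examples above concrete, here is a minimal Python sketch that derives the localename parts from a language identifier or locale name. It is an illustration only, not the GNU gettext implementation: the function name localename_candidates is made up for this message, and the codeset normalization (lowercase, alphanumeric characters only) is a simplifying assumption chosen so that the output reproduces the four lists above.

```python
def localename_candidates(name):
    """Return the localename parts to try for one locale name, in order.

    Hypothetical helper, written only to reproduce the examples in this
    message. Assumed input syntax: language[_TERRITORY][.codeset][@modifier]
    """
    # Peel off the optional parts from right to left.
    rest, _, modifier = name.partition("@")
    rest, _, codeset = rest.partition(".")
    language, _, territory = rest.partition("_")

    # Assumption: a simplified codeset normalization ("UTF-8" -> "utf8").
    normalized = "".join(c for c in codeset if c.isalnum()).lower()

    # Each optional part is tried first as given, then dropped.
    modifiers = ["@" + modifier, ""] if modifier else [""]
    territories = ["_" + territory, ""] if territory else [""]
    codesets = []
    for cs in (codeset, normalized):
        if cs and "." + cs not in codesets:
            codesets.append("." + cs)
    codesets.append("")

    # Modifier varies slowest, codeset fastest, matching the lists above.
    return [
        language + t + c + m
        for m in modifiers
        for t in territories
        for c in codesets
    ]


if __name__ == "__main__":
    for example in ("de", "de_AT", "de_AT.UTF-8", "uz_UZ.UTF-8@cyrillic"):
        print(example, "->", " ".join(localename_candidates(example)))
```

With a built-in expansion of this kind, each candidate is substituted for localename in dirname/localename/categoryname/textdomainname.mo, so no symbolic links and no hand-written fallback entries in LANGUAGE are needed.
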
Bruno

[1] https://en.wikipedia.org/wiki/Uzbek_alphabet
[2] $ ls -l /usr/share/locale
    total 102
    drwxr-xr-x 3 root other 3 Oct 13 2018 C
    drwxr-xr-x 3 root other 4 Oct 13 2018 de
    lrwxrwxrwx 1 root root  2 Oct 13 2018 de_DE -> de
    lrwxrwxrwx 1 root root  2 Oct 13 2018 de_DE.ISO8859-1 -> de
    lrwxrwxrwx 1 root root  2 Oct 13 2018 de_DE.ISO8859-15 -> de
    lrwxrwxrwx 1 root root  2 Oct 13 2018 de_DE.UTF-8 -> de
    lrwxrwxrwx 1 root root  2 Oct 13 2018 de.ISO8859-15 -> de
    drwxr-xr-x 3 root other 3 Oct 13 2018 de.us-ascii
    lrwxrwxrwx 1 root root  2 Oct 13 2018 de.UTF-8 -> de
    drwxr-xr-x 3 root other 3 Oct 13 2018 en
    drwxr-xr-x 3 root other 3 Oct 13 2018 en_US
    drwxr-xr-x 3 root other 3 Oct 13 2018 en@boldquot
    drwxr-xr-x 3 root other 3 Oct 13 2018 en@quot
    drwxr-xr-x 3 root other 3 Oct 13 2018 en@shaw
    drwxr-xr-x 3 root other 4 Oct 13 2018 es
    drwxr-xr-x 3 root other 3 Oct 13 2018 es_ES
    lrwxrwxrwx 1 root root  2 Oct 13 2018 es_ES.ISO8859-1 -> es
    lrwxrwxrwx 1 root root  2 Oct 13 2018 es_ES.ISO8859-15 -> es
    lrwxrwxrwx 1 root root  2 Oct 13 2018 es_ES.UTF-8 -> es
    lrwxrwxrwx 1 root root  2 Oct 13 2018 es.ISO8859-15 -> es
    lrwxrwxrwx 1 root root  2 Oct 13 2018 es.UTF-8 -> es
    drwxr-xr-x 3 root other 4 Oct 13 2018 fr
    lrwxrwxrwx 1 root root  2 Oct 13 2018 fr_FR -> fr
    lrwxrwxrwx 1 root root  2 Oct 13 2018 fr_FR.ISO8859-1 -> fr
    lrwxrwxrwx 1 root root  2 Oct 13 2018 fr_FR.ISO8859-15 -> fr
    lrwxrwxrwx 1 root root  2 Oct 13 2018 fr_FR.UTF-8 -> fr
    lrwxrwxrwx 1 root root  2 Oct 13 2018 fr.ISO8859-15 -> fr
    lrwxrwxrwx 1 root root  2 Oct 13 2018 fr.UTF-8 -> fr
    drwxr-xr-x 3 root other 4 Oct 13 2018 it
    lrwxrwxrwx 1 root root  2 Oct 13 2018 it_IT -> it
    lrwxrwxrwx 1 root root  2 Oct 13 2018 it_IT.ISO8859-1 -> it
    lrwxrwxrwx 1 root root  2 Oct 13 2018 it_IT.ISO8859-15 -> it
    lrwxrwxrwx 1 root root  2 Oct 13 2018 it_IT.UTF-8 -> it
    lrwxrwxrwx 1 root root  2 Oct 13 2018 it.ISO8859-15 -> it
    lrwxrwxrwx 1 root root  2 Oct 13 2018 it.UTF-8 -> it
    drwxr-xr-x 3 root other 4 Oct 13 2018 ja
    lrwxrwxrwx 1 root root  2 Oct 13 2018 ja_JP.eucJP -> ja
    lrwxrwxrwx 1 root root  2 Oct 13 2018 ja_JP.PCK -> ja
    lrwxrwxrwx 1 root root  2 Oct 13 2018 ja_JP.UTF-8 -> ja
    drwxr-xr-x 3 root other 4 Oct 13 2018 ko
    lrwxrwxrwx 1 root root  2 Oct 13 2018 ko_KR.EUC -> ko
    lrwxrwxrwx 1 root root  2 Oct 13 2018 ko_KR.UTF-8 -> ko
    lrwxrwxrwx 1 root root  2 Oct 13 2018 ko.UTF-8 -> ko
    drwxr-xr-x 3 root other 4 Oct 13 2018 pt
    drwxr-xr-x 3 root other 4 Oct 13 2018 pt_BR
    lrwxrwxrwx 1 root root  5 Oct 13 2018 pt_BR.ISO8859-1 -> pt_BR
    drwxr-xr-x 3 root other 3 Oct 13 2018 pt_BR.us-ascii
    lrwxrwxrwx 1 root root  5 Oct 13 2018 pt_BR.UTF-8 -> pt_BR
    lrwxrwxrwx 1 root root  2 Oct 13 2018 pt.ISO8859-15 -> pt
    drwxr-xr-x 3 root other 3 Oct 13 2018 pt.us-ascii
    lrwxrwxrwx 1 root root  5 Oct 13 2018 zh -> zh_CN
    drwxr-xr-x 3 root other 4 Oct 13 2018 zh_CN
    lrwxrwxrwx 1 root root  5 Oct 13 2018 zh_CN.EUC -> zh_CN
    lrwxrwxrwx 1 root root  5 Oct 13 2018 zh_CN.GB18030 -> zh_CN
    lrwxrwxrwx 1 root root  5 Oct 13 2018 zh_CN.GBK -> zh_CN
    lrwxrwxrwx 1 root root  5 Oct 13 2018 zh_CN.UTF-8 -> zh_CN
    drwxr-xr-x 3 root other 4 Oct 13 2018 zh_TW
    lrwxrwxrwx 1 root root  5 Oct 13 2018 zh_TW.BIG5 -> zh_TW
    lrwxrwxrwx 1 root root  5 Oct 13 2018 zh_TW.EUC -> zh_TW
    lrwxrwxrwx 1 root root  5 Oct 13 2018 zh_TW.UTF-8 -> zh_TW
    lrwxrwxrwx 1 root root  5 Oct 13 2018 zh.GBK -> zh_CN
    lrwxrwxrwx 1 root root  5 Oct 13 2018 zh.UTF-8 -> zh_CN