Hello,

In several systems, you can ask for the version of a locale's definition. This can be important for persistent data structures that live long enough for the locale definition to change, but depend on its stability. Concretely, I mean things like on-disk btree indexes in databases that are ordered using strcoll_l(), when natural language ordering is desired.

http://www.unicode.org/reports/tr10/ says:

"Over time, collation order will vary: there may be fixes needed as more information becomes available about languages; there may be new government or industry standards for the language that require changes; and finally, new characters added to the Unicode Standard will interleave with the previously-defined ones. This means that collations must be carefully versioned."

For this reason, many database systems avoid using operating system locale support, but that has other downsides including disagreeing with other software on the same system.

While a system locale implementation could conceivably offer a way to open different versions of a locale explicitly or through the file system or environment, I think it would be good at a minimum for an application to have a standard way to know if the definition of an existing locale opened by name has changed.

Several locale APIs expose this information already:

https://man.freebsd.org/cgi/man.cgi?query=querylocale&sektion=3&format=html
https://learn.microsoft.com/en-us/windows/win32/api/winnls/nf-winnls-getnlsversionex
https://unicode-org.github.io/icu-docs/apidoc/dev/icu4c/ucol_8h.html

For example, PostgreSQL (which can use POSIX, Windows or ICU locales) checks these values to see if they have changed unexpectedly, and complains that affected indexes must be rebuilt if so. This usually happens after an operating system upgrade or migration to a different computer. Not doing so can result in data corruption, if btree traversals take wrong turns.

In a hypothetical standard API, the values returned could be left unspecified, only to be used to compare for equality with an earlier stored value. That is the case with the above-mentioned systems. In practice, they combine elements like the CLDR version, Unicode version etc.
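
To make the "compare for equality only" idea concrete, here is a minimal sketch of the kind of check an application might perform. It is not PostgreSQL's actual code, and the names check_collation_version() and index_needs_rebuild() are invented for illustration. It assumes the application has already obtained the current version from whatever interface the platform offers and rendered it as a string.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Outcome of comparing a stored version against the current one. */
enum version_check {
    VERSION_UNCHANGED,   /* strings are byte-for-byte identical */
    VERSION_CHANGED,     /* strings differ: dependent indexes are suspect */
    VERSION_UNKNOWN      /* one side has no version to compare */
};

/*
 * Hypothetical helper (name invented for this sketch).  "stored" is the
 * version recorded when the index was built; "current" is what the
 * platform reports now, rendered as a string by the caller.  Both are
 * treated as opaque: equality only, no ordering, no parsing.
 */
static enum version_check
check_collation_version(const char *stored, const char *current)
{
    if (stored == NULL || current == NULL)
        return VERSION_UNKNOWN;
    return strcmp(stored, current) == 0 ? VERSION_UNCHANGED : VERSION_CHANGED;
}

/* True if an index built under "stored" should be flagged for a rebuild. */
static bool
index_needs_rebuild(const char *stored, const char *current)
{
    return check_collation_version(stored, current) == VERSION_CHANGED;
}
```

Treating a missing version as "unknown" rather than "changed" is a design choice of this sketch; an application could reasonably decide to warn in that case too.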

Getting the information out:

Since POSIX 2024 has standardised getlocalename_l(), which is approximately the same as querylocale() (found on macOS and the BSDs), FreeBSD's querylocale() extension LC_VERSION_MASK wouldn't make much sense as a proposal. Other ideas include:

1. nl_langinfo_l(LC_LOCALE_VERSION(category), loc)

Inspired by glibc's non-standard nl_langinfo_l(NL_LOCALE_NAME(category), loc), which is like getlocalename_l(category, loc). Hammering a category into an nl_item parameter with a function-like macro is perhaps a little unusual.

2. getlocaleversion_l(category, loc)

Inspired by standard getlocalename_l(category, loc). Takes a category explicitly, which seems a little more natural to me, but it also creates a new function name.

Getting the information in:

The localedef locale definition source format could potentially define syntax for providing the version string, but I haven't studied this part yet. (FreeBSD's localedef currently has a -V switch to provide a version string, and the locales in the base system are compiled with that set to the Unicode CLDR version of the source data for LC_COLLATE.)

I'd love to hear any feedback on the general idea, or relevant systems I may be missing, before trying to propose something more concrete.

Thanks for reading,
Thomas Munro