Eli Zaretskii wrote:
> > * Alternatively, you could create an extended copy of gnulib/lib/mbchar.h,
> >   defining an abstract "multibyte character" that is UTF-8 encoded in case
> >   (U) and locale encoded in case (L), i.e. depending on a global variable.
> >   And then, an equally extended copy of gnulib/lib/mbiter.h, defining the
> >   iterator over such multibyte characters.
>
> Yes.
>
> It's up to you as a Gnulib developer, but I tend to think that ... maybe
> in the long run Gnulib should add the above-mentioned extensions to
> its mbchar.h and mbiter.h.

If there were large demand for it, I would do that. But I think most programs
use _either_ strings in locale encoding _or_ Unicode strings, for the reason
explained in [1]. The current 'info' reader is quite particular in working on
strings in locale encoding sometimes and on Unicode strings sometimes, in the
same places.

> as the
> Windows runtime doesn't support well (or not at all) characters beyond
> the BMP, replacing its standard C functions in Gnulib with versions
> that accept char32_t codepoints

This is an effort that is already completed in Gnulib: Since we can't
redefine the 'wchar_t' type, we had to create another set of functions
that work on char32_t[] strings. [2][3] A couple of GNU packages already
make use of them, so as to support characters outside the BMP correctly
on Cygwin and native Windows.

> and paying attention to the console's
> output codepage rather than the system locale's codeset

This is a request of the past. For a couple of years now, the output
functions in the Microsoft runtime library have converted automatically
from the locale encoding to the console's output codepage (e.g. from
CP1252 to CP850). [4] There is no longer any need to care about this
difference in Gnulib or in GNU packages, except for the workarounds
mentioned in [4].

Bruno

[1] https://www.gnu.org/software/libunistring/manual/html_node/In_002dmemory-representation.html
[2] https://www.gnu.org/software/gnulib/manual/html_node/Comparison-of-string-APIs.html
[3] https://www.gnu.org/software/gnulib/manual/html_node/Comparison-of-character-APIs.html
[4] https://lists.gnu.org/archive/html/bug-gnulib/2025-09/msg00199.html
