Eli Zaretskii wrote:
> >   * Alternatively, you could create an extended copy of gnulib/lib/mbchar.h,
> >     defining an abstract "multibyte character" that is UTF-8 encoded in case
> >     (U) and locale encoded in case (L), i.e. depending on a global variable.
> >     And then, an equally extended copy of gnulib/lib/mbiter.h, defining the
> >     iterator over such multibyte characters.
> 
> Yes.
> 
> It's up to you as a Gnulib developer, but I tend to think that ... maybe
> in the long run Gnulib should add the above-mentioned extensions to
> its mbchar.h and mbiter.h.

If there were large demand for it, I would do that. But I think most
programs use _either_ strings represented in locale encoding _or_
Unicode strings, for the reason explained in [1]. The current 'info'
reader is quite particular in working on strings in locale encoding
sometimes and on Unicode strings sometimes, in the same places.
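
For illustration only, the "abstract multibyte character" idea quoted above
could be sketched roughly as follows. This is not Gnulib code: the names
strings_are_utf8, next_char and count_chars are hypothetical, and the UTF-8
branch is a simplified decoder that does not reject overlong sequences or
surrogates. The locale branch uses the ISO C11 function mbrtoc32.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <uchar.h>
#include <wchar.h>

/* Hypothetical global switch: true in case (U), false in case (L).  */
bool strings_are_utf8 = true;

/* Hypothetical helper: decodes one character from S (at most N bytes)
   into *PC.  Returns the number of bytes consumed, or 0 at the end of
   the string or on invalid or incomplete input.  */
size_t
next_char (const char *s, size_t n, char32_t *pc)
{
  if (n == 0 || *s == '\0')
    return 0;
  if (strings_are_utf8)
    {
      /* Simplified UTF-8 decoding, independent of the locale.  */
      const unsigned char *u = (const unsigned char *) s;
      size_t len;
      char32_t c;
      if (u[0] < 0x80)
        {
          *pc = u[0];
          return 1;
        }
      else if ((u[0] & 0xE0) == 0xC0) { len = 2; c = u[0] & 0x1F; }
      else if ((u[0] & 0xF0) == 0xE0) { len = 3; c = u[0] & 0x0F; }
      else if ((u[0] & 0xF8) == 0xF0) { len = 4; c = u[0] & 0x07; }
      else
        return 0;
      if (len > n)
        return 0;
      for (size_t i = 1; i < len; i++)
        {
          if ((u[i] & 0xC0) != 0x80)
            return 0;
          c = (c << 6) | (u[i] & 0x3F);
        }
      *pc = c;
      return len;
    }
  else
    {
      /* Locale encoding: let the C library do the decoding.  */
      mbstate_t state;
      memset (&state, 0, sizeof state);
      size_t ret = mbrtoc32 (pc, s, n, &state);
      if (ret == 0 || ret == (size_t) -1 || ret == (size_t) -2
          || ret == (size_t) -3)
        return 0;
      return ret;
    }
}

/* Hypothetical iterator use: counts the characters in S.  */
size_t
count_chars (const char *s)
{
  size_t n = strlen (s);
  size_t count = 0;
  char32_t c;
  size_t k;
  while ((k = next_char (s, n, &c)) > 0)
    {
      s += k;
      n -= k;
      count++;
    }
  return count;
}
```

Every caller then pays for the run-time switch on each character, which is
one reason why committing to a single in-memory representation is simpler.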

> as the
> Windows runtime doesn't support well (or not at all) characters beyond
> the BMP, replacing its standard C functions in Gnulib with versions
> that accept char32_t codepoints

This is an effort that is already completed in Gnulib: Since we can't
redefine the 'wchar_t' type, we had to create another set of functions
that work on char32_t[] strings. [2][3] A couple of GNU packages already
make use of it, so as to support characters outside the BMP correctly
on Cygwin and native Windows.
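
As a minimal sketch of working with char32_t[] strings, the following
converts a locale-encoded string into an array of char32_t, one element per
character, so that a character outside the BMP occupies a single element.
It uses only the ISO C11 function mbrtoc32 from <uchar.h>; the helper name
locale_to_c32 is hypothetical and is not part of Gnulib. The caller is
assumed to have called setlocale beforehand if a non-"C" locale is wanted.

```c
#include <stddef.h>
#include <string.h>
#include <uchar.h>
#include <wchar.h>

/* Hypothetical helper: converts the locale-encoded string S into DEST,
   which has room for CAP elements.  Returns the number of char32_t
   elements written, or (size_t) -1 on invalid input or if DEST is too
   small.  */
size_t
locale_to_c32 (const char *s, char32_t *dest, size_t cap)
{
  mbstate_t state;
  memset (&state, 0, sizeof state);
  size_t remaining = strlen (s);
  size_t count = 0;

  while (remaining > 0)
    {
      char32_t c;
      size_t ret = mbrtoc32 (&c, s, remaining, &state);
      if (ret == (size_t) -1 || ret == (size_t) -2)
        return (size_t) -1;     /* invalid or incomplete sequence */
      if (ret == 0)
        break;                  /* null character; not expected here */
      if (count == cap)
        return (size_t) -1;     /* destination too small */
      dest[count++] = c;
      if (ret == (size_t) -3)
        continue;               /* character stored, no bytes consumed */
      s += ret;
      remaining -= ret;
    }
  return count;
}
```

The result can then be processed element by element, without the surrogate
handling that a 16-bit wchar_t would require.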

> and paying attention to the console's
> output codepage rather than the system locale's codeset

This is a request of the past. For a couple of years now, the output
functions in the Microsoft runtime library have automatic conversion
from the locale encoding to the console's output codepage (e.g.
from CP1252 to CP850). [4] There is no need any more to care about
this difference in Gnulib or in GNU packages, except for the workarounds
mentioned in [4].

Bruno

[1] https://www.gnu.org/software/libunistring/manual/html_node/In_002dmemory-representation.html
[2] https://www.gnu.org/software/gnulib/manual/html_node/Comparison-of-string-APIs.html
[3] https://www.gnu.org/software/gnulib/manual/html_node/Comparison-of-character-APIs.html
[4] https://lists.gnu.org/archive/html/bug-gnulib/2025-09/msg00199.html



