Corinna Vinschen wrote in
> it would be
> pretty nice if that code could get reverted back in to support
> non-BMP charsets even on Cygwin.

I agree that support for beyond-BMP characters should be added back to 'grep'.

Your earlier fix from 2013-08-16 (and the fact that the test failure
occurs precisely on the Windows and AIX platforms) shows that the problem
is that wchar_t is only 16 bits wide on these platforms.
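
To illustrate why 16 bits are not enough, here is a minimal sketch (my own
illustration, not code from 'grep' or gnulib; the function name encode_utf16
is hypothetical). It assumes the 16-bit units hold UTF-16, and shows that a
beyond-BMP code point needs two such units, so it cannot be returned as a
single 16-bit wchar_t:

```c
#include <assert.h>
#include <uchar.h>   /* char16_t, char32_t (C11) */

/* Hypothetical helper: store the UTF-16 encoding of the code point C
   in OUT and return the number of 16-bit units used (1 or 2).
   C must be a valid Unicode scalar value.  */
int
encode_utf16 (char32_t c, char16_t out[2])
{
  assert (c <= 0x10FFFF && !(c >= 0xD800 && c <= 0xDFFF));

  if (c < 0x10000)
    {
      /* BMP character: fits in a single 16-bit unit.  */
      out[0] = (char16_t) c;
      return 1;
    }

  /* Beyond-BMP character: split into a surrogate pair.  */
  c -= 0x10000;
  out[0] = (char16_t) (0xD800 + (c >> 10));    /* high surrogate */
  out[1] = (char16_t) (0xDC00 + (c & 0x3FF));  /* low surrogate */
  return 2;
}
```

For example, U+20AC takes one unit, whereas U+1F600 takes the two units
0xD83D 0xDE00.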

The type 'char32_t' has been introduced in C11 to overcome this limitation.[1]

I propose to

  1) introduce in gnulib support for <uchar.h>, char32_t, and mbrtoc32, so
     that we can use these instead of <wchar.h>, wchar_t, and mbrtowc

  2) change those gnulib modules that don't behave well with beyond-BMP
     characters on Windows and AIX to use char32_t instead of wchar_t.

Then the 'grep' code can be changed in a similar way, and this will
fix the bug on Cygwin and AIX (though not on native Windows [2]).

The advantage of this approach is that it needs only minimal code changes in 'grep': just
change some type and function names here and there, and add code for
the additional (size_t)(-3) return value of mbrtoc32.
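
A rough sketch of what such a converted loop might look like (this is not
the actual 'grep' code; the function name count_chars32 is hypothetical, and
I assume a C11 <uchar.h> is available, either from the system or from the
proposed gnulib module):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <uchar.h>   /* char32_t, mbrtoc32, mbstate_t (C11) */

/* Hypothetical helper: count the characters in the first N bytes of S,
   decoding with mbrtoc32 in the current locale.  Return (size_t) -1 if
   an invalid or incomplete multibyte sequence is found.  */
size_t
count_chars32 (const char *s, size_t n)
{
  assert (s != NULL);

  mbstate_t state;
  memset (&state, 0, sizeof state);   /* initial conversion state */
  size_t count = 0;

  while (n > 0)
    {
      char32_t c;
      size_t ret = mbrtoc32 (&c, s, n, &state);

      if (ret == (size_t) -1 || ret == (size_t) -2)
        return (size_t) -1;           /* invalid or incomplete sequence */

      if (ret == (size_t) -3)
        {
          /* The additional case: a further char32_t produced from
             bytes consumed by an earlier call; no input is consumed.
             This sketch does not drain such output after the last byte.  */
          count++;
          continue;
        }

      if (ret == 0)
        ret = 1;                      /* null character; assumed to be one byte */

      s += ret;
      n -= ret;
      count++;
    }

  return count;
}
```

Compared to a loop written with wchar_t and mbrtowc, only the type name, the
function name, and the (size_t) -3 branch differ.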


