https://sourceware.org/git/gitweb.cgi?p=newlib-cygwin.git;h=ada3b8f7e5a71db4881135380c407a86a08f6663
commit ada3b8f7e5a71db4881135380c407a86a08f6663
Author:     Corinna Vinschen <cori...@vinschen.de>
AuthorDate: Wed Jul 23 22:42:52 2025 +0200
Commit:     Corinna Vinschen <cori...@vinschen.de>
CommitDate: Thu Jul 24 11:23:27 2025 +0200

    Cygwin: _sys_wcstombs: add FIXME comment

    Add a FIXME comment to the conversion of private use area bytes
    to "normal" bytes in _sys_wcstombs detailing the conversion difference
    between _sys_wcstombs and standard wcstombs.  It might be a good idea
    to drop the conversion to gather compatibility with wcstombs.

    Signed-off-by: Corinna Vinschen <cori...@vinschen.de>

Diff:
---
 winsup/cygwin/strfuncs.cc | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/winsup/cygwin/strfuncs.cc b/winsup/cygwin/strfuncs.cc
index caaf6b786295..eb6576051d90 100644
--- a/winsup/cygwin/strfuncs.cc
+++ b/winsup/cygwin/strfuncs.cc
@@ -965,8 +965,17 @@ _sys_wcstombs (char *dst, size_t len, const wchar_t *src, size_t nwc,
       /* Convert UNICODE private use area. Reverse functionality for the
          ASCII area <= 0x7f (only for path names) is transform_chars above.
+
          Reverse functionality for invalid bytes in a multibyte sequence is
-         in _sys_mbstowcs below. */
+         in _sys_mbstowcs below.
+
+         FIXME? The conversion of invalid bytes from the private use area
+         like we do here is not actually necessary. If we skip it, the
+         generated multibyte string is not identical to the original multibyte
+         string, but it's equivalent in the sense, that another mbstowcs will
+         generate the same wide-char string. It would also be identical to
+         the same string converted by wcstombs. And while the original
+         multibyte string can't be converted by mbstowcs, this string can. */
       if (is_path && (pw & 0xff00) == 0xf000
           && (((cwc = (pw & 0xff)) <= 0x7f && tfx_rev_chars[cwc] >= 0xf000)
               || (cwc >= 0x80 && MB_CUR_MAX > 1)))
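
The difference the FIXME describes can be sketched in isolation. The code below is not the Cygwin implementation: it assumes a UTF-8 target charset, handles only BMP code units (no surrogate pairs), and ignores is_path, tfx_rev_chars and MB_CUR_MAX. The names is_parked_byte, append_utf8, to_multibyte and check_examples are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: true if wc looks like an invalid byte >= 0x80 that
// was stored in the private use area as 0xf000 | byte.
constexpr bool is_parked_byte(char16_t wc)
{
  return (wc & 0xff00) == 0xf000 && (wc & 0xff) >= 0x80;
}

// Append the UTF-8 encoding of one BMP code unit (surrogates not handled).
void append_utf8(std::string &out, char16_t wc)
{
  if (wc < 0x80)
    out += static_cast<char>(wc);
  else if (wc < 0x800)
    {
      out += static_cast<char>(0xc0 | (wc >> 6));
      out += static_cast<char>(0x80 | (wc & 0x3f));
    }
  else
    {
      out += static_cast<char>(0xe0 | (wc >> 12));
      out += static_cast<char>(0x80 | ((wc >> 6) & 0x3f));
      out += static_cast<char>(0x80 | (wc & 0x3f));
    }
}

// restore_bytes == true:  emit the single original byte for 0xf080..0xf0ff,
//                         roughly what the commented code path does.
// restore_bytes == false: encode every code unit as plain UTF-8, which is
//                         the alternative the FIXME suggests.
std::string to_multibyte(const std::u16string &src, bool restore_bytes)
{
  std::string out;
  for (char16_t wc : src)
    {
      if (restore_bytes && is_parked_byte(wc))
        out += static_cast<char>(wc & 0xff);
      else
        append_utf8(out, wc);
    }
  return out;
}

// 'a' followed by a stray 0xff byte that was stored as 0xf0ff.
void check_examples()
{
  const std::u16string src{u'a', char16_t(0xf0ff)};
  assert(to_multibyte(src, true) == "a\xff");          // original, invalid UTF-8
  assert(to_multibyte(src, false) == "a\xef\x83\xbf"); // valid UTF-8 for U+F0FF
}
```

With restore_bytes set, the result reproduces the original byte sequence, which is not valid UTF-8. Without it, the result is a valid three-byte sequence for U+F0FF that a standard decoder maps back to the same wide character, which is the equivalence the comment refers to.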