Corinna Vinschen wrote:
On Jun 29 19:13, Christian Franke wrote:
Fixes the CESU-8 value, but not the missing encoding if the high surrogate
is at the very end of the string.
Are you going to provide a patch for that issue?
Not very soon, as this possibly requires non-trivial rework, including
comprehensive testing.
The function behind __WCTOMB() must also be called with the final L'\0'
as input, so that a pending high surrogate gets written out. Currently
this is not the case. For example, in _wcstombs_r() only the second
__WCTOMB() call (the s != NULL branch) ever sees L'\0'. The (s == NULL)
branch stops before the terminator and implicitly assumes that converting
it would only append '\0' and return 1.
newlib/libc/stdlib/wcstombs_r.c:
size_t
_wcstombs_r (...)
{
  ...
  if (s == NULL)
    {
      ...
      while (*pwcs != 0)
        {
          bytes = __WCTOMB (r, buff, *pwcs++, state);
          ...
          num_bytes += bytes;
        }
      return num_bytes;
    }
  else
    {
      while (n > 0)
        {
          bytes = __WCTOMB (r, buff, *pwcs, state);
          ...
          if (*pwcs == 0x00)
            return ptr - s - (n >= bytes);
          ...
        }
      ...
    }
}
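To illustrate the problem outside of newlib, here is a minimal toy sketch.
It is not the newlib code: toy_state, toy_wctomb_len() and both counting
helpers are hypothetical names, and the toy converter only counts bytes.
It assumes a converter that holds back a high surrogate until the next
code unit shows whether it is paired, which is why the terminator has to
be fed in as well.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy state: a pending high surrogate, or 0 if none.  */
struct toy_state { uint16_t pending_high; };

/* Toy converter: returns the number of output bytes produced when the
   UTF-16 code unit WC is fed in.  A high surrogate produces nothing
   until the following unit (possibly the terminator) arrives.  */
static size_t
toy_wctomb_len (uint16_t wc, struct toy_state *st)
{
  size_t n = 0;

  if (st->pending_high)
    {
      st->pending_high = 0;
      if (wc >= 0xdc00 && wc <= 0xdfff)
        return 4;               /* complete pair: 4-byte UTF-8 */
      n += 3;                   /* lone high surrogate: 3 bytes */
    }
  if (wc >= 0xd800 && wc <= 0xdbff)
    {
      st->pending_high = wc;    /* hold back until the next unit */
      return n;
    }
  if (wc < 0x80)
    return n + 1;               /* includes the terminator */
  if (wc < 0x800)
    return n + 2;
  return n + 3;
}

/* Counts like the (s == NULL) branch: the terminator is never fed in.  */
static size_t
count_skipping_terminator (const uint16_t *pwcs)
{
  struct toy_state st = { 0 };
  size_t num_bytes = 0;

  while (*pwcs != 0)
    num_bytes += toy_wctomb_len (*pwcs++, &st);
  return num_bytes;
}

/* Feeds the terminator too, then drops its single '\0' byte.  */
static size_t
count_feeding_terminator (const uint16_t *pwcs)
{
  struct toy_state st = { 0 };
  size_t num_bytes = 0;

  while (*pwcs != 0)
    num_bytes += toy_wctomb_len (*pwcs++, &st);
  num_bytes += toy_wctomb_len (0, &st);
  return num_bytes - 1;
}
```

For the input { 0x41, 0xd83d, 0 }, the first helper reports 1 byte and the
second reports 4, so in this toy model the 3 bytes of the trailing high
surrogate are lost when the terminator is skipped.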
...
+ tmp = (((state->__value.__wchb[0] << 16 | state->__value.__wchb[1] << 8)
+ - 0x10000) >> 10) | 0xd800;
*s++ = 0xe0 | ((tmp & 0xf000) >> 12);
*s++ = 0x80 | ((tmp & 0xfc0) >> 6);
*s++ = 0x80 | (tmp & 0x3f);
--
2.45.1
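As a standalone check of the arithmetic in the hunk above, here is a small
sketch. The struct and function names are hypothetical, and the way the
high surrogate is saved into the two state bytes is an assumption (it is
not shown in this thread); only the recovery expression and the three
output bytes are taken from the patch.

```c
#include <stdint.h>

/* Hypothetical stand-in for the two state bytes used by the patch.  */
struct surrogate_state { unsigned char wchb[2]; };

/* Assumed storage format: the high surrogate (0xd800..0xdbff) is kept
   as the upper bits of the eventual code point.  */
static void
save_high_surrogate (struct surrogate_state *st, uint16_t high)
{
  uint32_t tmp = ((uint32_t) (high & 0x3ff) << 10) + 0x10000;

  st->wchb[0] = (tmp >> 16) & 0xff;
  st->wchb[1] = (tmp >> 8) & 0xff;
}

/* Recover the surrogate with the expression from the patch and write
   its 3-byte CESU-8 form to S.  Returns the number of bytes written.  */
static int
flush_lone_high_surrogate (const struct surrogate_state *st,
                           unsigned char *s)
{
  uint32_t tmp = ((((uint32_t) st->wchb[0] << 16
                    | (uint32_t) st->wchb[1] << 8)
                   - 0x10000) >> 10) | 0xd800;

  *s++ = 0xe0 | ((tmp & 0xf000) >> 12);
  *s++ = 0x80 | ((tmp & 0xfc0) >> 6);
  *s++ = 0x80 | (tmp & 0x3f);
  return 3;
}
```

Under that storage assumption, saving 0xd83d and flushing it again yields
the bytes 0xed 0xa0 0xbd, which is the CESU-8 form of that surrogate.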
LGTM, please push.
Done.
--
Thanks,
Christian