On Thu, Mar 17, 2016 at 05:27:49PM -0700, Kevin J. McCarthy wrote:
> On Thu, Mar 17, 2016 at 11:36:23PM +0000, Richard Russon wrote:
> > This is Karel Zak's patch to fix handling of (illegal) multi-byte chars.
> I need some time to look at this before I push it. If any of the other
   memset (&mbstate, 0, sizeof (mbstate));
   for (w = 0; n && (cl = mbrtowc (&wc, src, n, &mbstate)); src += cl, n -= cl)
   {
-    if (cl == (size_t)(-1) || cl == (size_t)(-2))
+    if (cl == (size_t)(-1) || cl == (size_t)(-2)) {
       cw = cl = 1;
+      memset(&mbstate, 0, sizeof (mbstate));
+    }
To save you a little time:
The mbrtowc(3) man page states:

    *mbstate must be a valid mbstate_t object. An mbstate_t object a
    can be initialized to the initial state by zeroing it, for example
    using
        memset(&a, 0, sizeof(a));

    If the multibyte string ... contains an invalid multibyte sequence
    ... the effects on *mbstate are undefined.

This implies that after an error we should reset mbstate.
I can trigger this case.
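
To illustrate the reset outside of the patch context, here is a minimal
standalone sketch. The helper name count_chars_lossy() is hypothetical and
not part of mutt; it only assumes the same policy as the hunk above, where
an undecodable byte is consumed as a single character and the state is
zeroed before decoding continues.

```c
#define _XOPEN_SOURCE 700
#include <string.h>
#include <wchar.h>

/* Hypothetical helper (not mutt code): count the characters in the first
 * n bytes of src, treating every undecodable byte as one character. */
size_t count_chars_lossy(const char *src, size_t n)
{
    mbstate_t st;
    wchar_t wc;
    size_t count = 0;

    memset(&st, 0, sizeof(st));          /* start in the initial state */
    while (n > 0) {
        size_t cl = mbrtowc(&wc, src, n, &st);

        if (cl == 0)
            break;                       /* decoded a NUL: end of string */
        if (cl == (size_t)(-1) || cl == (size_t)(-2)) {
            cl = 1;                      /* skip exactly one bad byte */
            /* After an error the state is undefined, so reset it
             * before the next call. */
            memset(&st, 0, sizeof(st));
        }
        src += cl;
        n -= cl;
        count++;
    }
    return count;
}
```

Which bytes count as invalid depends on the current locale, so the same
input may take the error branch under one locale and not under another.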
       cw = wcwidth (wc);
       if (cw < 0 && cl == 1 && src[0] && src[0] < M_TREE_MAX)
         cw = 1;
+      else if (cw < 0)
+        cw = 0; /* unprintable wchar */
     }
     if (cl + l > maxlen || cw + w > maxwid)
       break;
I don't know what would cause mbrtowc() to succeed, but then wcwidth()
to fail. However, as cw is one of our length offsets, we really don't
want it to be negative.
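
For what it's worth, POSIX documents wcwidth() as returning -1 when the
wide character is not printable, so a control character that decodes
cleanly would be one way to end up here. A minimal sketch of the clamp on
its own follows; the helper name safe_wcwidth() is hypothetical and not
part of mutt.

```c
#define _XOPEN_SOURCE 700
#include <wchar.h>

/* Hypothetical helper (not mutt code): column width of wc, clamped so
 * that it can safely be added to a running width. */
int safe_wcwidth(wchar_t wc)
{
    int cw = wcwidth(wc);    /* -1 means "not printable" */

    return cw < 0 ? 0 : cw;
}
```

Clamping to 0 matches the patch: an unprintable character takes up no
columns rather than pulling the running total backwards.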
     w += cw;
Rich / FlatCap