On Thu, Mar 17, 2016 at 05:27:49PM -0700, Kevin J. McCarthy wrote:
> On Thu, Mar 17, 2016 at 11:36:23PM +0000, Richard Russon wrote:
> > This is Karel Zak's patch to fix handling of (illegal) multi-byte chars.
> I need some time to look at this before I push it.  If any of the other

   memset (&mbstate, 0, sizeof (mbstate));
   for (w = 0; n && (cl = mbrtowc (&wc, src, n, &mbstate)); src += cl, n -= cl)
   {
-    if (cl == (size_t)(-1) || cl == (size_t)(-2))
+    if (cl == (size_t)(-1) || cl == (size_t)(-2)) {
       cw = cl = 1;
+      memset(&mbstate, 0, sizeof (mbstate));
+    }

To save you a little time:

The mbrtowc(3) man page states:

    *mbstate must be a valid mbstate_t object.  An mbstate_t object a
    can be initialized to the initial state by zeroing it, for example
    using

        memset(&a, 0, sizeof(a));

    If the multibyte string ... contains an invalid multibyte sequence
    ...  the effects on *mbstate are undefined.

This implies that we should reset mbstate after an error.
I can trigger this error case.
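
For illustration, here is a minimal standalone sketch of the
reset-after-error pattern.  It is not the mutt code: the function name
count_steps() and its "invalid" counter are hypothetical.  Like the
patch, it skips one byte whenever mbrtowc() reports an invalid or
incomplete sequence, then zeroes mbstate before decoding continues.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper, not part of the patch.  Walks n bytes of src and
 * counts decoding steps: each valid character is one step, and each byte
 * that mbrtowc() rejects is also one step, tallied in *invalid. */
size_t count_steps(const char *src, size_t n, size_t *invalid)
{
  mbstate_t mbstate;
  wchar_t wc;
  size_t cl, steps = 0;

  assert(src != NULL && invalid != NULL);
  *invalid = 0;
  memset(&mbstate, 0, sizeof(mbstate));

  /* mbrtowc() returns 0 once it decodes a NUL, which ends the loop. */
  while (n > 0 && (cl = mbrtowc(&wc, src, n, &mbstate)) != 0)
  {
    if (cl == (size_t)(-1) || cl == (size_t)(-2))
    {
      cl = 1;                                 /* skip one bad byte */
      (*invalid)++;
      /* After an error the state is undefined, so start afresh. */
      memset(&mbstate, 0, sizeof(mbstate));
    }
    src += cl;
    n -= cl;
    steps++;
  }
  return steps;
}
```

Unless the caller has run setlocale(), decoding happens in the "C"
locale; which bytes count as invalid depends on the active locale.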

       cw = wcwidth (wc);
       if (cw < 0 && cl == 1 && src[0] && src[0] < M_TREE_MAX)
         cw = 1;
+      else if (cw < 0)
+        cw = 0;         /* unprintable wchar */
     }
     if (cl + l > maxlen || cw + w > maxwid)
       break;

I don't know what would cause mbrtowc() to succeed, but then wcwidth()
to fail.  However, as cw is one of our length offsets, we really don't
want it to be negative.

    w += cw;
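
To show why a negative cw matters, here is a small sketch under stated
assumptions.  The function chars_that_fit() is hypothetical and takes
the widths as plain ints, standing in for values as wcwidth() would
report them (-1 for a non-printable character).  Without the clamp, a
run of -1 widths would lower the running total and let later characters
slip past the maxwid check.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of the patch.  raw[] holds per-character
 * widths as wcwidth() would report them, where -1 marks a non-printable
 * character.  Returns how many leading characters fit in maxwid columns. */
size_t chars_that_fit(const int *raw, size_t n, int maxwid)
{
  int w = 0;
  size_t i;

  assert(raw != NULL && maxwid >= 0);
  for (i = 0; i < n; i++)
  {
    int cw = raw[i];
    if (cw < 0)
      cw = 0;           /* unprintable: occupies no columns */
    if (cw + w > maxwid)
      break;
    w += cw;            /* never decreases, thanks to the clamp */
  }
  return i;
}
```

With widths {1, -1, -1, 1, 1} and maxwid 2, the clamped version stops
after four characters.  Adding the -1 values directly would leave the
total at 1 after all five, so the loop would accept three printable
columns in a two-column field.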

Rich / FlatCap
