Re: [PATCH] reset mbstate on error

Kevin J. McCarthy Sat, 19 Mar 2016 19:48:54 -0700

On Fri, Mar 18, 2016 at 12:40:47PM +0000, Richard Russon wrote:
> On Thu, Mar 17, 2016 at 05:27:49PM -0700, Kevin J. McCarthy wrote:
> > On Thu, Mar 17, 2016 at 11:36:23PM +0000, Richard Russon wrote:
> > > This is Karel Zak's patch to fix handling of (illegal) multi-byte chars.
> > I need some time to look at this before I push it.  If any of the other
> 
>    memset (&mbstate, 0, sizeof (mbstate));
>    for (w = 0; n && (cl = mbrtowc (&wc, src, n, &mbstate)); src += cl, n -= 
> cl)
>    {
> -    if (cl == (size_t)(-1) || cl == (size_t)(-2))
> +    if (cl == (size_t)(-1) || cl == (size_t)(-2)) {
>        cw = cl = 1;
> +      memset(&mbstate, 0, sizeof (mbstate));
> +    }


I wonder why this function is using type int for n, l, and cl instead of
size_t, like mutt_paddstr() and mutt_charlen() just above and below.
The return value of the function is int, but all the callers are
assigning the retval to a size_t.  Seems like this should be fixed too.

The patch appears fine in that it does appear we should be reseting the
mbstate for cl == (size_t)-1, and don't want cw<0 when we go to
"increment" w.

But I'm wondering about the behavior for cl == (size_t)-2.  This
indicates the rest of the string doesn't contain a complete multibyte
character.  Why doesn't it set 'cl = n; cw = 0;' or just break out of
the loop instead?  Instead it looks like it will count each byte left as
an l and a w.

I'll push the patch tomorrow (or when I get back after the conference if
tomorrow is too busy), and I'll push another patch changing the int to
size_t.  I'm not feeling brave enough to change the behavior for cl==-2
this close to release, so probably will let that one lie for now.

-- 
Kevin J. McCarthy
GPG Fingerprint: 8975 A9B3 3AA3 7910 385C  5308 ADEF 7684 8031 6BDA
http://www.8t8.us/configs/gpg-key-transition-statement.txt

signature.asc
Description: PGP signature

Re: [PATCH] reset mbstate on error

Reply via email to