On Thu, May 23, 2024 at 10:25 AM Chet Ramey <chet.ra...@case.edu> wrote:
>
> On 5/21/24 2:42 PM, Grisha Levit wrote:
> > Avoid using (size_t)-1 as an offset.
>
> I can't reproduce this on macOS. Where is the code that's using -1 as an
> offset?

The loop in rl_change_case does the following:

    rl_change_case(count=-1, op=2) at text.c:1483:9
       1481   while (start < end)
       1482     {
    -> 1483       c = _rl_char_value (rl_line_buffer, start);

    _rl_char_value(buf="\xc0", ind=0) at mbutil.c:493:23
       491    l = strlen (buf);
       492    if (ind + 1 >= l)
    -> 493      return ((WCHAR_T) buf[ind]);

    (wchar_t) c = L'À'

This seems questionable, since a string consisting of just \xC0 and a string
actually representing \u00C0 (\xC3\x80) will both return the same thing.
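
To make the collision concrete, here is a minimal sketch. The helper names
byte_as_wchar and decode_utf8_2byte are hypothetical, not readline
functions, and the cast through unsigned char is my own assumption so that
the result does not depend on whether plain char is signed:

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical stand-in for the single-byte fallback: widen one raw byte.
   Going through unsigned char is an assumption made for this sketch. */
wchar_t
byte_as_wchar (const char *buf, size_t ind)
{
  assert (buf != NULL);
  return (wchar_t) (unsigned char) buf[ind];
}

/* Hypothetical helper: decode a two-byte UTF-8 sequence (110xxxxx 10xxxxxx).
   Returns (wchar_t)-1 if the bytes do not have that shape. */
wchar_t
decode_utf8_2byte (const char *buf)
{
  unsigned char b0, b1;

  assert (buf != NULL);
  b0 = (unsigned char) buf[0];
  b1 = (unsigned char) buf[1];
  if ((b0 & 0xE0) != 0xC0 || (b1 & 0xC0) != 0x80)
    return (wchar_t) -1;
  /* Five payload bits from the lead byte, six from the continuation byte. */
  return (wchar_t) (((b0 & 0x1F) << 6) | (b1 & 0x3F));
}
```

With these, byte_as_wchar ("\xc0", 0) and decode_utf8_2byte ("\xc3\x80")
both yield 0xC0, while decode_utf8_2byte ("\xc0") rejects the lone byte.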


The next check treats C as alphabetic, since it is LATIN CAPITAL LETTER A
WITH GRAVE, so the loop does not skip ahead:

    rl_change_case(count=-1, op=2) at text.c:1487:28
    -> 1487       if (_rl_walphabetic (c) == 0)
       1488         {
       1489           inword = 0;
       1490           start = next;
       1491           continue;

    _rl_walphabetic(wc=L'À') at util.c:89:5
       88     if (iswalnum (wc))
    -> 89       return (1);


So we call mbrtowc on the same string position, and since the byte there is
not a valid multibyte character, (size_t)-1 is stored in M:

    rl_change_case(count=-1, op=2) at text.c:1512:22
    -> 1512           m = MBRTOWC (&wc, rl_line_buffer + start, end - start, &mps);

    (size_t) m = 18446744073709551615


Then we again interpret \xC0 as if it were \u00C0:

    rl_change_case(count=-1, op=2) at text.c:1514:20
       1513           if (MB_INVALIDCH (m))
    -> 1514             wc = (WCHAR_T)rl_line_buffer[start];

    (wchar_t) wc = L'À'


And lowercase that character, storing the length of its multibyte encoding
in MLEN:

    rl_change_case(count=-1, op=2) at text.c:1517:11
    -> 1517           nwc = (nop == UpCase) ? _rl_to_wupper (wc) : _rl_to_wlower (wc);

    rl_change_case(count=-1, op=2) at text.c:1524:28
    -> 1524               mlen = WCRTOMB (mb, nwc, &ts);

    (wchar_t) nwc = L'à'
    (int) mlen = 2


Since WC and NWC are different, and M (being (size_t)-1) is greater than MLEN:

    rl_change_case(count=-1, op=2) at text.c:1544:13
       1541               else if (m > mlen)
       1542                 {
       1543                   memcpy (s, mb, mlen);
    -> 1544                   memmove (s + mlen, s + m, (e - s) - m);

So the second arg to memmove is a pointer one behind S.
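
The arithmetic can be checked in isolation. This is a sketch on plain
integers, so it never forms an out-of-bounds pointer itself. All helper
names are hypothetical, and safe_char_len is only one possible guard, not
necessarily what the posted patch does:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: the offset arithmetic behind `s + m`.
   Unsigned addition wraps modulo SIZE_MAX + 1. */
size_t
offset_after_add (size_t base, size_t m)
{
  return base + m;
}

/* Hypothetical helper: the length arithmetic behind `(e - s) - m`. */
size_t
length_after_sub (size_t remaining, size_t m)
{
  return remaining - m;
}

/* Hypothetical helper: mirrors `m > mlen` with an int MLEN, which is
   converted to size_t before the comparison. */
int
m_exceeds_mlen (size_t m, int mlen)
{
  assert (mlen >= 0);
  return m > (size_t) mlen;
}

/* One possible guard (an assumption for this sketch): treat an invalid or
   incomplete conversion result as a single byte. */
size_t
safe_char_len (size_t m)
{
  if (m == (size_t) -1 || m == (size_t) -2)
    return 1;
  return m;
}
```

With m = (size_t)-1, offset_after_add (10, m) is 9 and
length_after_sub (5, m) is 6, so the memmove source starts one byte early
and the length is one byte larger than (e - s). After safe_char_len, M is 1
and the `m > mlen` branch is no longer taken for MLEN = 2.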
