On Tue, Feb 17, 2015 at 9:30 PM, Ulf Magnusson <[email protected]> wrote:
> On Tue, Feb 17, 2015 at 9:28 PM, Ulf Magnusson <[email protected]> wrote:
>> Thanks for the feedback!
>>
>> The -1 comparison should be safe in practice on non-exotic systems
>> where the size (rank) of size_t is at least that of int, but yeah,
>> it's kinda pointless and stupid to leave out the cast.
>>
>> I think I'll roll the mbrtowc -2 case into the error case as wc_len <
>> 0 for now. It'd be weird MB_CUR_MAX gave -2, but it's worth checking
>> for at least.
>>
>> I added control character handling by doing the following btw:
>>
>>     width += iswcntrl(wc) ? 2 : max(0, wcwidth(wc));
>>
>> Guess that might catch more characters than it should though.
>>
>> I also noticed that readline outputs things like "~Z" for some (meta?)
>> characters. Might want to get back to that later...
>>
>> /Ulf
>
> (Excuse the top-posting by the way. Gmail keeps tripping me up. :P)
>
> /Ulf

(wc_len < 0 would not work of course, since wc_len is an unsigned
size_t and is never negative. What I really meant was to handle the -2
case the same as the -1 case.)

/Ulf

>
>>
>> On Tue, Feb 17, 2015 at 5:44 PM, Chet Ramey <[email protected]> wrote:
>>> On 2/16/15 4:52 PM, Ulf Magnusson wrote:
>>>> On Mon, Feb 16, 2015 at 4:43 PM, Ulf Magnusson <[email protected]> wrote:
>>>>> I'll try it. Thanks for the suggestion!
>>>>>
>>>>> /Ulf
>>>>>
>>>>
>>>> Here's what I came up with in case someone else runs into the same
>>>> problem. I'm sure there's more stuff to handle (not sure what to do
>>>> for non-printable characters for example), but it seems to handle
>>>> multibyte (tested using åäö's and Chinese) and combining characters
>>>> correctly for UTF-8 at least:
>>>
>>> This is basically what an implementation of wcswidth looks like. A couple
>>> of suggestions:
>>>
>>>> // Returns the total width (in columns) of the characters in the 'n'-byte
>>>> // prefix of the null-terminated multibyte string 's'. If 'n' is larger than
>>>> // 's', returns the total width of the string. Suitable for calculating a
>>>> // cursor position.
>>>> //
>>>> // Makes a guess for malformed strings.
>>>> static size_t strnwidth(const char *s, size_t n) {
>>>>     mbstate_t shift_state;
>>>>     wchar_t wc;
>>>>     size_t wc_len;
>>>>     size_t width = 0;
>>>>
>>>>     // Start in the initial shift state.
>>>>     memset(&shift_state, '\0', sizeof shift_state);
>>>>
>>>>     for (size_t i = 0; i < n; i += wc_len) {
>>>>         // Extract the next multibyte character.
>>>>         wc_len = mbrtowc(&wc, s + i, MB_CUR_MAX, &shift_state);
>>>>         if (wc_len == 0)
>>>>             // Reached the end of the string.
>>>>             break;
>>>>         if (wc_len == -1)
>>>
>>> wc_len is a size_t, which is usually unsigned. You need to cast the -1
>>> to (size_t)-1. You also need to handle mbrtowc returning (size_t)-2.
>>>
>>>
>>> --
>>> ``The lyf so short, the craft so long to lerne.'' - Chaucer
>>> ``Ars longa, vita brevis'' - Hippocrates
>>> Chet Ramey, ITS, CWRU    [email protected]    http://cnswww.cns.cwru.edu/~chet/

_______________________________________________
Bug-readline mailing list
[email protected]
https://lists.gnu.org/mailman/listinfo/bug-readline
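
For anyone finding this thread in the archive: below is a rough sketch of how strnwidth might look with the points from the thread folded in, namely comparing against (size_t)-1 and (size_t)-2 and counting control characters as two columns. It is not the code that was actually committed anywhere. The original error branch is cut off in the quote above, so the recovery policy here (count one column, skip one byte, reset the shift state) is purely an assumption, and the two-column rule for control characters simply follows the iswcntrl() line mentioned earlier in the thread.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* asks the C library to declare wcwidth() */
#endif
#include <stdlib.h>
#include <string.h>
#include <wchar.h>
#include <wctype.h>

/* Harmless redeclaration: keeps the sketch compiling if the feature-test
   macro above ends up after an earlier #include. */
int wcwidth(wchar_t wc);

/* Returns the total width (in columns) of the characters in the n-byte
   prefix of the null-terminated multibyte string s. Decoding follows the
   current LC_CTYPE locale, so a real program would normally call
   setlocale(LC_CTYPE, "") once at startup. */
static size_t strnwidth(const char *s, size_t n)
{
    mbstate_t shift_state;
    wchar_t wc;
    size_t wc_len;
    size_t width = 0;

    /* Start in the initial shift state. */
    memset(&shift_state, 0, sizeof shift_state);

    for (size_t i = 0; i < n; i += wc_len) {
        /* Extract the next multibyte character. */
        wc_len = mbrtowc(&wc, s + i, MB_CUR_MAX, &shift_state);
        if (wc_len == 0)
            break;              /* reached the terminating null */

        /* size_t is unsigned, so both error values need the cast. */
        if (wc_len == (size_t)-1 || wc_len == (size_t)-2) {
            /* Assumed policy for malformed input: one column for the
               bad byte, skip it, and restart from a clean state. */
            memset(&shift_state, 0, sizeof shift_state);
            width += 1;
            wc_len = 1;
            continue;
        }

        if (iswcntrl((wint_t)wc)) {
            width += 2;         /* shown as two columns, e.g. "^A" */
        } else {
            int w = wcwidth(wc);
            if (w > 0)          /* -1 (non-printable) adds nothing */
                width += (size_t)w;
        }
    }
    return width;
}
```

With a UTF-8 locale in effect, passing the byte offset of the cursor as n should give the cursor column, which is the use case described earlier in the thread. As noted there, iswcntrl() may well match more characters than readline actually displays in two columns.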
