Hi Kevin, Kurt,

On 2026-06-19T03:25:57-0400, Kurt Hackenberg wrote:
> On Fri, Jun 19, 2026 at 14:50 +0800, Kevin J. McCarthy wrote:
> 
> > Mutt uses its own ascii_* functions because the built-in ones are
> > problematic for some locales.  The isspace() has weird issues too...
> 
> Ah.

In shadow-utils we do something similar; we name them with a shorter _c
suffix, meaning that they unconditionally use the C locale.

I re-implemented isspace_c() et al. recently, quite compactly:

        #define CTYPE_CNTRL_C                                                 \
                "\x7F"                                                        \
                "\x1F\x1E\x1D\x1C\x1B\x1A\x19\x18\x17\x16\x15\x14\x13\x12\x11\x10" \
                "\x0F\x0E\x0D\x0C\x0B\x0A\x09\x08\x07\x06\x05\x04\x03\x02\x01" /*NUL*/

        #define CTYPE_LOWER_C     "abcdefghijklmnopqrstuvwxyz"
        #define CTYPE_UPPER_C     "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
        #define CTYPE_DIGIT_C     "0123456789"
        #define CTYPE_PUNCT_C     "!\"#$%&'()*+,-./:;<=>?@[\\]^_`{|}~"
        #define CTYPE_SPACE_C     " \t\n\v\f\r"
        #define CTYPE_ALPHA_C     CTYPE_LOWER_C CTYPE_UPPER_C
        #define CTYPE_ALNUM_C     CTYPE_ALPHA_C CTYPE_DIGIT_C
        #define CTYPE_GRAPH_C     CTYPE_ALNUM_C CTYPE_PUNCT_C
        #define CTYPE_PRINT_C     CTYPE_GRAPH_C " "
        #define CTYPE_XDIGIT_C    CTYPE_DIGIT_C "abcdefABCDEF"
        #define CTYPE_ASCII_C     CTYPE_PRINT_C CTYPE_CNTRL_C /*NUL*/

        // isascii_c - is [:ascii:] C-locale
        #define isascii_c(c)      (!!strchr(CTYPE_ASCII_C, c))
        #define iscntrl_c(c)      (!!strchr(CTYPE_CNTRL_C, c))
        #define islower_c(c)      (!streq(strchrnul(CTYPE_LOWER_C, c), ""))
        #define isupper_c(c)      (!streq(strchrnul(CTYPE_UPPER_C, c), ""))
        #define isdigit_c(c)      (!streq(strchrnul(CTYPE_DIGIT_C, c), ""))
        #define ispunct_c(c)      (!streq(strchrnul(CTYPE_PUNCT_C, c), ""))
        #define isspace_c(c)      (!streq(strchrnul(CTYPE_SPACE_C, c), ""))
        #define isalpha_c(c)      (!streq(strchrnul(CTYPE_ALPHA_C, c), ""))
        #define isalnum_c(c)      (!streq(strchrnul(CTYPE_ALNUM_C, c), ""))
        #define isgraph_c(c)      (!streq(strchrnul(CTYPE_GRAPH_C, c), ""))
        #define isprint_c(c)      (!streq(strchrnul(CTYPE_PRINT_C, c), ""))
        #define isxdigit_c(c)     (!streq(strchrnul(CTYPE_XDIGIT_C, c), ""))

        // strisascii_c - string is [:ascii:] C-locale
        #define strisdigit_c(s)   streq(stpspn(s, CTYPE_DIGIT_C), "")
        #define strisprint_c(s)   streq(stpspn(s, CTYPE_PRINT_C), "")

        // strchriscntrl_c - string character is [:cntrl:] C-locale
        #define strchriscntrl_c(s)  (!!strpbrk(s, CTYPE_CNTRL_C))
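
To show why two different idioms appear above, here is a minimal,
portable sketch.  strchr(3) treats the terminating NUL as part of the
string, so strchr(set, '\0') succeeds; that is what lets NUL count as
[:cntrl:] and [:ascii:].  The strchrnul()+streq() form rejects NUL,
because a miss and a NUL both land on the terminator.  All sketch_*
names are hypothetical stand-ins of mine (not the shadow-utils code),
written so the example doesn't depend on the GNU strchrnul(3).

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for streq(): true if both strings are equal. */
static inline bool sketch_streq(const char *a, const char *b)
{
	return strcmp(a, b) == 0;
}

/* Hypothetical stand-in for strchrnul(3): like strchr(3), but on a
 * miss it returns a pointer to the terminating NUL instead of NULL. */
static inline const char *sketch_strchrnul(const char *s, int c)
{
	while (*s != '\0' && *s != (char) c)
		s++;
	return s;
}

#define SKETCH_SPACE_C  " \t\n\v\f\r"
#define SKETCH_CNTRL_C                                                        \
	"\x7F"                                                                \
	"\x1F\x1E\x1D\x1C\x1B\x1A\x19\x18\x17\x16\x15\x14\x13\x12\x11\x10"    \
	"\x0F\x0E\x0D\x0C\x0B\x0A\x09\x08\x07\x06\x05\x04\x03\x02\x01" /*NUL*/

/* NUL is not a space: a miss and c == '\0' both yield "". */
static inline bool sketch_isspace_c(int c)
{
	return !sketch_streq(sketch_strchrnul(SKETCH_SPACE_C, c), "");
}

/* NUL is a control character: strchr() finds the terminator itself. */
static inline bool sketch_iscntrl_c(int c)
{
	return strchr(SKETCH_CNTRL_C, c) != NULL;
}
```

With these, sketch_isspace_c('\0') is false while sketch_iscntrl_c('\0')
is true, without spelling NUL in either set.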

> > We could change the retval of the function to true/false and use the

I agree with returning a boolean for a match (that's a better API),
but then please rename the function to ...eq() instead of ...cmp(), and
also invert the meaning, so that true means there's a match.

See the manual page for streq(3) for example, for prior art in an
existing libc.
<https://www.man7.org/linux/man-pages/man3/streq.3.html>
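
A rough sketch of what I mean by the rename, assuming an ASCII-only,
case-insensitive comparison.  The demo_* names are hypothetical (I'm
not proposing them for Mutt); the point is only that the ...cmp()
flavor returns 0 on a match, while the ...eq() flavor returns true.

```c
#include <stdbool.h>

/* ASCII-only uppercase conversion; independent of the current locale. */
static inline int demo_toupper_c(int c)
{
	return (c >= 'a' && c <= 'z') ? c - 'a' + 'A' : c;
}

/* cmp-style: returns 0 on a match, nonzero otherwise. */
static int demo_casecmp_c(const char *a, const char *b)
{
	for (;; a++, b++) {
		int ca = demo_toupper_c((unsigned char) *a);
		int cb = demo_toupper_c((unsigned char) *b);

		if (ca != cb)
			return ca - cb;
		if (ca == '\0')
			return 0;
	}
}

/* eq-style: returns true on a match, so callers read naturally. */
static inline bool demo_caseeq_c(const char *a, const char *b)
{
	return demo_casecmp_c(a, b) == 0;
}
```

A caller then writes if (demo_caseeq_c(a, b)) instead of
if (!demo_casecmp_c(a, b)), which is harder to get backwards.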

> > ascii_toupper() and IS_ASCII_WS() function in your loop above.  Or keep
> > the comparison and just do something like:
> > 
> >  size_t a_len, b_len;
> > 
> >  a_len = mutt_strlen(a);
> >  b_len = b ? strcspn(b, " \t\n\v\f\r") : 0;
> >  return ascii_strncasecmp(a, b, MAX(a_len, b_len));

LGTM; this should also fix the latent truncation bug.  This is the most
readable version I've seen so far.  :)
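
For reference, a self-contained sketch of how I read that snippet.
The example_* helpers are hypothetical stand-ins for mutt_strlen(),
ascii_strncasecmp(), and MAX(), whose real definitions I haven't
reproduced; the NULL handling is my assumption.  Taking the larger of
the two lengths means a shorter string can't match merely because it is
a prefix of the longer one.

```c
#include <stddef.h>
#include <string.h>

#define EXAMPLE_MAX(a, b)  (((a) > (b)) ? (a) : (b))

/* ASCII-only uppercase conversion; independent of the current locale. */
static inline int example_toupper_c(int c)
{
	return (c >= 'a' && c <= 'z') ? c - 'a' + 'A' : c;
}

/* Stand-in for ascii_strncasecmp(): compare at most n bytes,
 * ignoring ASCII case, stopping early at a terminating NUL. */
static int example_strncasecmp_c(const char *a, const char *b, size_t n)
{
	for (; n > 0; a++, b++, n--) {
		int ca = example_toupper_c((unsigned char) *a);
		int cb = example_toupper_c((unsigned char) *b);

		if (ca != cb)
			return ca - cb;
		if (ca == '\0')
			break;
	}
	return 0;
}

/* Compare a against the first whitespace-delimited word of b.
 * Returns 0 on a match (cmp-style).  NULL is treated as "". */
static int example_wordcmp(const char *a, const char *b)
{
	size_t  a_len, b_len;

	a_len = a ? strlen(a) : 0;
	b_len = b ? strcspn(b, " \t\n\v\f\r") : 0;

	return example_strncasecmp_c(a ? a : "", b ? b : "",
	                             EXAMPLE_MAX(a_len, b_len));
}
```

Here "foo" matches "FOO bar", but neither "foo" vs "foobar baz" nor
"foobar" vs "foo bar" does.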


Cheers,
Alex

> 
> Either should be OK; I don't care much. Both improve on what's there now.

-- 
<https://www.alejandro-colomar.es>
