Hi Kevin, Kurt,

On 2026-06-19T03:25:57-0400, Kurt Hackenberg wrote:
> On Fri, Jun 19, 2026 at 14:50 +0800, Kevin J. McCarthy wrote:
>
> > Mutt uses its own ascii_* functions because the built-in ones are
> > problematic for some locales. The isspace() has weird issues too...
>
> Ah.

In shadow-utils we do something similar; we call them with a shorter _c
suffix meaning that they work unconditionally with the C locale.
I re-implemented isspace_c() et al. recently, quite compactly:
#define CTYPE_CNTRL_C \
"\x7F" \
"\x1F\x1E\x1D\x1C\x1B\x1A\x19\x18\x17\x16\x15\x14\x13\x12\x11\x10" \
"\x0F\x0E\x0D\x0C\x0B\x0A\x09\x08\x07\x06\x05\x04\x03\x02\x01"
/*NUL*/
#define CTYPE_LOWER_C "abcdefghijklmnopqrstuvwxyz"
#define CTYPE_UPPER_C "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
#define CTYPE_DIGIT_C "0123456789"
#define CTYPE_PUNCT_C "!\"#$%&'()*+,-./:;<=>?@[\\]^_`{|}~"
#define CTYPE_SPACE_C " \t\n\v\f\r"
#define CTYPE_ALPHA_C CTYPE_LOWER_C CTYPE_UPPER_C
#define CTYPE_ALNUM_C CTYPE_ALPHA_C CTYPE_DIGIT_C
#define CTYPE_GRAPH_C CTYPE_ALNUM_C CTYPE_PUNCT_C
#define CTYPE_PRINT_C CTYPE_GRAPH_C " "
#define CTYPE_XDIGIT_C CTYPE_DIGIT_C "abcdefABCDEF"
#define CTYPE_ASCII_C CTYPE_PRINT_C CTYPE_CNTRL_C /*NUL*/
// isascii_c - is [:ascii:] C-locale
#define isascii_c(c) (!!strchr(CTYPE_ASCII_C, c))
#define iscntrl_c(c) (!!strchr(CTYPE_CNTRL_C, c))
#define islower_c(c) (!streq(strchrnul(CTYPE_LOWER_C, c), ""))
#define isupper_c(c) (!streq(strchrnul(CTYPE_UPPER_C, c), ""))
#define isdigit_c(c) (!streq(strchrnul(CTYPE_DIGIT_C, c), ""))
#define ispunct_c(c) (!streq(strchrnul(CTYPE_PUNCT_C, c), ""))
#define isspace_c(c) (!streq(strchrnul(CTYPE_SPACE_C, c), ""))
#define isalpha_c(c) (!streq(strchrnul(CTYPE_ALPHA_C, c), ""))
#define isalnum_c(c) (!streq(strchrnul(CTYPE_ALNUM_C, c), ""))
#define isgraph_c(c) (!streq(strchrnul(CTYPE_GRAPH_C, c), ""))
#define isprint_c(c) (!streq(strchrnul(CTYPE_PRINT_C, c), ""))
#define isxdigit_c(c) (!streq(strchrnul(CTYPE_XDIGIT_C, c), ""))
// strisascii_c - string is [:ascii:] C-locale
#define strisdigit_c(s) streq(stpspn(s, CTYPE_DIGIT_C), "")
#define strisprint_c(s) streq(stpspn(s, CTYPE_PRINT_C), "")
// strchriscntrl_c - string character is [:cntrl:] C-locale
#define strchriscntrl_c(s) (!!strpbrk(s, CTYPE_CNTRL_C))
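
One subtlety in the macros above: strchr(3) also matches the terminating
NUL of the set, which is why only isascii_c() and iscntrl_c() use it (NUL
is a control character), while the others go through strchrnul() so that
NUL never matches. Here is a rough portable sketch of the same idea
without the GNU strchrnul(). All the *_sketch names and in_set_c() are
made up for illustration, and I'm assuming stpspn() returns a pointer just
past the initial span, like s + strspn(s, accept).

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names; these only mirror the shape of the macros above. */
#define SPACE_C_SKETCH  " \t\n\v\f\r"
#define DIGIT_C_SKETCH  "0123456789"

/*
 * True iff c is one of the characters in set.  strchr() would also
 * match the terminating NUL of set, so NUL is rejected explicitly.
 */
static bool
in_set_c(const char *set, char c)
{
	return c != '\0' && strchr(set, c) != NULL;
}

static bool
isspace_c_sketch(char c)
{
	return in_set_c(SPACE_C_SKETCH, c);
}

static bool
isdigit_c_sketch(char c)
{
	return in_set_c(DIGIT_C_SKETCH, c);
}

/*
 * True iff every character of s is a digit.  The empty string yields
 * true, as it would with streq(s + strspn(s, set), "").
 */
static bool
strisdigit_c_sketch(const char *s)
{
	return s[strspn(s, DIGIT_C_SKETCH)] == '\0';
}
```

Since the sets are plain string literals, the result never depends on
setlocale(3), and a byte such as 0xA0 is never classified as a space.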
> > We could change the retval of the function to true/false and use the
I agree with returning a boolean for a match (that's a better API),
but then please rename the function to ...eq() instead of ...cmp(), and
also invert the meaning, so that true means there's a match.
See the manual page for streq(3) for example, for prior art in an
existing libc.
<https://www.man7.org/linux/man-pages/man3/streq.3.html>
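
To illustrate the naming convention, here is a minimal sketch of an
...eq()-style, ASCII-only, case-insensitive comparison. Both
strcaseeq_ascii() and to_upper_c() are hypothetical names for this
example, not existing Mutt or libc functions.

```c
#include <stdbool.h>

/* Hypothetical ASCII-only upper-casing; never consults the locale. */
static int
to_upper_c(int c)
{
	return (c >= 'a' && c <= 'z') ? c - 'a' + 'A' : c;
}

/*
 * Hypothetical ...eq()-style predicate: true means the strings match,
 * ignoring ASCII case.  Both arguments must be non-NULL.
 */
static bool
strcaseeq_ascii(const char *a, const char *b)
{
	for (;; a++, b++) {
		if (to_upper_c((unsigned char) *a)
		    != to_upper_c((unsigned char) *b))
			return false;
		if (*a == '\0')
			return true;
	}
}
```

A call site then reads as if (strcaseeq_ascii(a, b)), without the
== 0 that a ...cmp() function needs.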
> > ascii_toupper() and IS_ASCII_WS() function in your loop above. Or keep
> > the comparison and just do something like:
> >
> > size_t a_len, b_len;
> >
> > a_len = mutt_strlen(a);
> > b_len = b ? strcspn(b, " \t\n\v\f\r") : 0;
> > return ascii_strncasecmp(a, b, MAX(a_len, b_len));
LGTM; this should also fix the latent truncation bug. This is the most
readable thing I've seen so far. :)
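
For reference, a self-contained sketch of that suggestion. The helpers
upper_ascii() and ncasecmp_ascii() are hypothetical stand-ins for
ascii_toupper() and ascii_strncasecmp(), and the a ? strlen(a) : 0 is my
assumption of what mutt_strlen() does with NULL; token_casecmp() is also
a made-up name. As far as I can tell, taking the larger of the two
lengths is what keeps a mere prefix from comparing equal.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for ascii_toupper(): ASCII-only. */
static int
upper_ascii(int c)
{
	return (c >= 'a' && c <= 'z') ? c - 'a' + 'A' : c;
}

/* Hypothetical stand-in for ascii_strncasecmp(); NULL is treated as "". */
static int
ncasecmp_ascii(const char *a, const char *b, size_t n)
{
	if (a == NULL)
		a = "";
	if (b == NULL)
		b = "";

	for (size_t i = 0; i < n; i++) {
		int ca = upper_ascii((unsigned char) a[i]);
		int cb = upper_ascii((unsigned char) b[i]);

		if (ca != cb)
			return ca - cb;
		if (ca == '\0')
			return 0;
	}
	return 0;
}

/*
 * Compare a against b, where b ends at its first whitespace character.
 * Returns 0 on a case-insensitive match, nonzero otherwise.
 */
static int
token_casecmp(const char *a, const char *b)
{
	size_t  a_len, b_len;

	a_len = a ? strlen(a) : 0;
	b_len = b ? strcspn(b, " \t\n\v\f\r") : 0;

	/* The larger length ensures neither string is only a prefix. */
	return ncasecmp_ascii(a, b, a_len > b_len ? a_len : b_len);
}
```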
Cheers,
Alex
>
> Either should be OK; I don't care much. Both improve on what's there now.
--
<https://www.alejandro-colomar.es>
