Hi Walter,

Walter Alejandro Iglesias wrote on Mon, Aug 18, 2025 at 06:40:04PM +0200:
> Question for the experts.  Let's take the following example:
>
> ----->8------------->8--------------------
> #include <stdio.h>
> #include <string.h>
> #include <wchar.h>
>
> #define period		0x2e
> #define question	0x3f
> #define exclam		0x21
> #define ellipsis	L'\u2026'
>
> const wchar_t p[] = { period, question, exclam, ellipsis };

In addition to what otto@ said, this is bad style for more than
one reason.

First of all, the data type of the constant "0x2e" is "int", see
for example C11 6.4.4.1 (Integer constants).  Converting "int" to
"wchar_t" doesn't really make sense.  On OpenBSD, it only works
because UTF-8 is the only supported character encoding *and*
wchar_t stores Unicode codepoints.  But neither of these choices
is portable.

What you want is (C11 6.4.4.4 Character constants):

  #define period	L'.'
  #define question	L'?'
  #define exclam	L'!'

> int
> main()
> {
> 	const wchar_t s[] = L". Hello.";
>
> 	printf("%ls\n", s);
> 	printf("%lu\n", wcsspn(s, p));

The return value of wcsspn(3) is size_t, so this should use %zu.

Besides, given that the second argument of wcsspn(3) takes
"const wchar_t *", why not simply:

  const wchar_t *p = L".?!\u2026";

And finally, if you want wchar_t to store UTF-8 strings, you need
something like

  #include <err.h>
  #include <locale.h>

  if (setlocale(LC_CTYPE, "C.UTF-8") == NULL)
	errx(1, "setlocale failed");

Otherwise, the C library functions operating on wide strings assume
that wchar_t only stores ASCII character numbers.  Even printf(3)
%ls won't work for UTF-8 characters without setting the locale
properly.

Yours,
  Ingo