On Mon, Aug 18, 2025 at 06:40:04PM +0200, Walter Alejandro Iglesias wrote:

> Question for the experts.  Let's take the following example:
>
> ----->8------------->8--------------------
> #include <stdio.h>
> #include <string.h>
> #include <wchar.h>
>
> #define period 0x2e
> #define question 0x3f
> #define exclam 0x21
> #define ellipsis L'\u2026'
>
> const wchar_t p[] = { period, question, exclam, ellipsis };

This is not a string, as it is not NUL terminated, so there's garbage
after it, which wcsspn() will pick up until it hits a NUL.  Reading
past the end of the array is undefined behaviour, so the differing
results below do not point to a bug in either compiler.

Declaring it as

	const wchar_t p[5] = { period, question, exclam, ellipsis }

and/or initializing it as

	{ period, question, exclam, ellipsis, '\0' }

should work.

	-Otto

>
> int
> main()
> {
> 	const wchar_t s[] = L". Hello.";
>
> 	printf("%ls\n", s);
> 	printf("%lu\n", wcsspn(s, p));
>
> 	return 0;
> }
> -------------8<-----------8<----------------
>
>
> Now run:
>
> $ cc -Wall example.c -o example && ./example
> . Hello.
> 8
> $ egcc -Wall example.c -o example && ./example
> . Hello.
> 1
>
> As you see, compiled with GCC the program does what is expected.  To get
> the desired result with CLANG you have to write the string literally.
> Change the declaration of p[] above to:
>
> const wchar_t p[] = L".?!…";
>                          ^ This is a UTF-8 ellipsis.
>
> And now:
>
> $ cc -Wall example.c -o example && ./example
> . Hello.
> 1
>
> Using only ASCII or only UTF-8 in the array also works.
>
> Is this a bug in clang's wcsspn() or I'm wrong in assuming that the
> array can be declared in the way I did?
>
>
> --
> Walter
>
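For reference, here is a minimal sketch of the accept set with the
terminator written out explicitly, so that wcsspn() stops at the end of
the array.  The helper name leading_punct() is made up for this
illustration and does not appear in the original program; the macros
are the ones from the quoted example.

```c
#include <stddef.h>
#include <wchar.h>

#define period   0x2e
#define question 0x3f
#define exclam   0x21
#define ellipsis L'\u2026'

/*
 * The explicit trailing L'\0' makes this array a valid wide string,
 * which is what wcsspn() requires for its second argument.
 */
static const wchar_t p[] = { period, question, exclam, ellipsis, L'\0' };

/*
 * Hypothetical helper (not in the original post): returns the length
 * of the initial run of s made up only of the characters in p.
 */
size_t
leading_punct(const wchar_t *s)
{
	return wcsspn(s, p);
}
```

With this declaration, leading_punct(L". Hello.") should return 1
regardless of the compiler, since only the leading '.' is in the set.
If the result is printed, %zu is the matching printf conversion for
size_t.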