On Mon, Aug 18, 2025 at 06:40:04PM +0200, Walter Alejandro Iglesias wrote:

> Question for the experts.  Let's take the following example:
> 
> ----->8------------->8--------------------
> #include <stdio.h>
> #include <string.h>
> #include <wchar.h>
> 
> #define period                0x2e
> #define question      0x3f
> #define exclam                0x21
> #define ellipsis      L'\u2026'
> 
> const wchar_t p[] = { period, question, exclam, ellipsis };

This is not a string, as it is not NUL terminated, so wcsspn() reads past
the end of the array and picks up whatever garbage follows until it hits
a NUL.  That is undefined behavior, which is why the result differs
between compilers.

Declaring it as const wchar_t p[5] = { period, question, exclam, ellipsis }
(the fifth element is then zero-initialized)

and/or initializing it as { period, question, exclam, ellipsis, L'\0' }

should work.

        -Otto

> 
> int
> main()
> {
>       const wchar_t s[] = L". Hello.";
> 
>       printf("%ls\n", s);
>       printf("%lu\n", wcsspn(s, p));
> 
>       return 0;
> }
> -------------8<-----------8<----------------
> 
> 
> Now run:
> 
>   $ cc -Wall example.c -o example && ./example
>   . Hello.
>   8
>   $ egcc -Wall example.c -o example && ./example
>   . Hello.
>   1
> 
> As you see, compiled with GCC the program does what is expected.  To get
> the desired result with CLANG you have to write the string literally.
> Change the declaration of p[] above to:
> 
>   const wchar_t p[] = L".?!…";
>                            ^ This is a UTF-8 ellipsis.
> 
> And now:
> 
>   $ cc -Wall example.c -o example && ./example
>   . Hello.
>   1
> 
> Using only ASCII or only UTF-8 in the array also works.
> 
> Is this a bug in clang's wcsspn(), or am I wrong in assuming that the
> array can be declared in the way I did?
> 
> 
> -- 
> Walter
> 
