Michael B Allen writes:

> What's the ultimate goal here? Are any of these functions *supposed*
> to work on multi-byte characters, or will there be mbs* functions?

strcpy strcat strdup
    already work for multi-byte characters

strncpy strncat strncmp
    cannot work for multi-byte characters because their limit is a
    byte count, which can fall in the middle of a character

strcspn strspn strpbrk strstr
    you can write multibyte aware analogs of these

strchr strrchr
    use a multibyte aware strstr analog instead
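
To make the distinction above concrete, here is a rough sketch of two
such analogs built on the standard mbrlen(). The names mbs_prefix_len
and mbs_str are hypothetical, not taken from any existing header, and
the sketch assumes a stateless encoding (such as UTF-8) in the current
locale.

```c
#include <assert.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: number of leading bytes of s, at most max_bytes,
 * that contain only complete characters in the locale's encoding.
 * Returns (size_t)-1 on an invalid byte sequence. */
size_t
mbs_prefix_len(const char *s, size_t max_bytes)
{
    mbstate_t ps;
    size_t len = 0;

    assert(s != NULL);
    memset(&ps, 0, sizeof(ps));

    while (len < max_bytes) {
        size_t n = mbrlen(s + len, max_bytes - len, &ps);

        if (n == (size_t)-1) {
            return (size_t)-1;      /* invalid byte sequence */
        }
        if (n == (size_t)-2 || n == 0) {
            break;                  /* character cut by the limit, or NUL */
        }
        len += n;
    }
    return len;
}

/* Hypothetical strstr analog: compares needle against hay only at
 * character boundaries, so a match can never start mid-character. */
char *
mbs_str(const char *hay, const char *needle)
{
    mbstate_t ps;
    size_t hlen, nlen;

    assert(hay != NULL && needle != NULL);
    hlen = strlen(hay);
    nlen = strlen(needle);
    memset(&ps, 0, sizeof(ps));

    while (hlen >= nlen) {
        size_t n;

        if (memcmp(hay, needle, nlen) == 0) {
            return (char *)hay;
        }
        n = mbrlen(hay, hlen, &ps);
        if (n == (size_t)-1 || n == (size_t)-2 || n == 0) {
            break;                  /* invalid or incomplete input */
        }
        hay += n;                   /* step one whole character */
        hlen -= n;
    }
    return NULL;
}
```

A truncating copy could then use memcpy with mbs_prefix_len(src, limit)
bytes, and a strchr analog could call mbs_str with the character's
byte sequence as the needle.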

Nothing is standardized in this area, but IMO an <mbstring.h> include
file which defines these for arbitrary encodings, and an <unistring.h>
which defines these for UTF-8 strings, would be very nice. I'm working
on an LGPL'ed implementation of the latter.

> /*
>  * Returns a pointer to the character at off within the multi-byte string
                                        ^^^^^^
Emphasize: at _screen_position_ off.

>  * src not examining more than sn bytes.
>  */
> char *
> mbsnoff(char *src, int off, size_t sn)
> {
>     unsigned long ucs;
>     int w;  
>     size_t n;
>     mbstate_t ps;
> 
>     ucs = 1;
>     memset(&ps, 0, sizeof(ps));
> 
>     if (sn > INT_MAX) {
>         sn = INT_MAX;
>     }
>     if (off < 0) {
>         off = INT_MAX;
>     }
> 
>     while (ucs && (n = mbrtowc(&ucs, src, sn, &ps)) != (size_t)-2) {

Change that to:

      while (sn > 0 && (n = mbrtowc(&ucs, src, sn, &ps)) != (size_t)-2) {

>         if (n == (size_t)-1) {
>             return NULL;
>         }
>         if ((w = wcwidth(ucs)) > 0) {
>             if (w > off) {
>                 break;
>             }
>             off -= w;
>         }
>         sn -= n;
>         src += n;
>     }
> 
>     return src;
> }
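
For reference, a rough sketch of how the function might look with that
loop condition applied. Two things here are my assumptions and not
part of the quoted code: the output variable is declared as wchar_t,
since that is the type mbrtowc() writes through, and there is an
explicit check for n == 0, because mbrtowc() returns 0 at a NUL and the
loop would otherwise stop advancing. The name mbsnoff_sketch is only
there to keep it apart from the original.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* needed for the wcwidth() declaration */
#endif
#include <assert.h>
#include <limits.h>
#include <string.h>
#include <wchar.h>

/*
 * Returns a pointer to the character at screen position off within the
 * multi-byte string src, not examining more than sn bytes.
 * Returns NULL on an invalid byte sequence.
 */
char *
mbsnoff_sketch(char *src, int off, size_t sn)
{
    wchar_t wc;
    int w;
    size_t n;
    mbstate_t ps;

    assert(src != NULL);
    memset(&ps, 0, sizeof(ps));

    if (sn > INT_MAX) {
        sn = INT_MAX;
    }
    if (off < 0) {
        off = INT_MAX;              /* negative offset: walk to the end */
    }

    while (sn > 0 && (n = mbrtowc(&wc, src, sn, &ps)) != (size_t)-2) {
        if (n == (size_t)-1) {
            return NULL;            /* invalid byte sequence */
        }
        if (n == 0) {
            break;                  /* reached the terminating NUL */
        }
        if ((w = wcwidth(wc)) > 0) {
            if (w > off) {
                break;              /* this character covers column off */
            }
            off -= w;
        }
        sn -= n;
        src += n;
    }

    return src;
}
```

The result depends on the locale's LC_CTYPE setting, so a caller would
normally run setlocale(LC_CTYPE, "") first.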

Bruno
--
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
