Pablo Saratxaga wrote on 2001-05-10 15:50 UTC:
> Btw, I think that a number of CJK displaying problems I keep saying
> are linked to the fact that strlen() doesn't work for non 8bit encodings.
>
> There is wcslen() of course; but often the strings are not in wc, but in mb.
> converting mb<->wc adds complexity, and some programmers that don't worry
> too much about i18n won't care about it.
>
> Is there an mbslen()? (that is, a function like wcslen, but applied to
> a mb string; that does any necessary mb<->wc conversion internally).
Sort of. Either
  #define mbslen(s, ps) mbsrtowcs(NULL, &(s), SIZE_MAX, ps)
or
  #define mbslen(s) mbsrtowcs(NULL, &(s), SIZE_MAX, NULL)
might do the job, provided s is an lvalue of type const char *. (You can't
use mbstowcs here, unfortunately, because ISO C 99 doesn't specify that it
can be used with pwcs==NULL. :-(( )
Note that these functions return (size_t)(-1) if they run into a
malformed sequence, which I think is a big hassle in practice in
languages without exception handling.
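
To illustrate the error-handling hassle, here is a minimal sketch of the same
idea written as a function instead of a macro. The name mbs_count_chars and
its return convention are my own assumptions for illustration, not part of
any standard:

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper, not a standard function: counts the characters
 * in the multibyte string s under the current LC_CTYPE locale.
 * Returns 0 and stores the count in *count on success; returns -1 if
 * s contains an invalid multibyte sequence. */
int mbs_count_chars(const char *s, size_t *count)
{
    mbstate_t state;
    const char *p = s;      /* mbsrtowcs needs a pointer to a pointer */
    size_t n;

    memset(&state, 0, sizeof state);        /* initial conversion state */
    n = mbsrtowcs(NULL, &p, 0, &state);     /* dst == NULL: count only */
    if (n == (size_t)-1)
        return -1;                          /* malformed sequence */
    *count = n;
    return 0;
}
```

The caller has to check the return value on every call, and should call
setlocale(LC_CTYPE, "") once at startup so that the user's encoding is used.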
The length of a string matters for two applications:
a) Find out how much memory to allocate. This requires a byte count,
and strlen does exactly what you want, even for multi-byte encodings.
b) Find out how many columns the cursor will advance if a string is sent
to a terminal. For wide strings we have wcswidth, but for multibyte
strings there is no standardized convenient alternative.
I don't think you want to replace strlen with mbslen very frequently!
The thing that I *REALLY* miss is the multi-byte version of wcwidth and
wcswidth:
  mbwidth    column width of one multi-byte character
  mbswidth   column width of a multi-byte string
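
Until something like that exists, the string version can be approximated with
mbrtowc and wcwidth. The following is only a sketch under my own assumptions:
the name mbs_columns and the choice to return -1 for both malformed input and
nonprintable characters are hypothetical, not a proposed interface.

```c
#define _XOPEN_SOURCE 700   /* makes wcwidth() visible in <wchar.h> */
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper, not a standard function: returns the number of
 * terminal columns occupied by the multibyte string s in the current
 * LC_CTYPE locale, or -1 if s contains an invalid sequence or a
 * nonprintable character. */
int mbs_columns(const char *s)
{
    mbstate_t state;
    size_t remaining = strlen(s);   /* bytes still to be decoded */
    int columns = 0;

    memset(&state, 0, sizeof state);        /* initial conversion state */
    while (remaining > 0) {
        wchar_t wc;
        size_t len = mbrtowc(&wc, s, remaining, &state);
        int w;

        if (len == (size_t)-1 || len == (size_t)-2)
            return -1;              /* invalid or truncated sequence */
        if (len == 0)
            break;                  /* decoded a null character: stop */
        w = wcwidth(wc);
        if (w < 0)
            return -1;              /* nonprintable character */
        columns += w;
        s += len;
        remaining -= len;
    }
    return columns;
}
```

As with any mb function, the result depends on the locale, so the program
should call setlocale(LC_CTYPE, "") before using it.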
It would be up to X/Open to add these, because ISO C has decided that it
doesn't want to be responsible for character terminal width
information.
Can mbwidth/mbswidth still be squeezed into the POSIX/SUS merger
specification that is currently being finalized?
Markus
--
Markus G. Kuhn, Computer Laboratory, University of Cambridge, UK
Email: mkuhn at acm.org, WWW: <http://www.cl.cam.ac.uk/~mgk25/>
-
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/lists/