It turns out that fs/nls/nls_base.c already includes conversion
routines such as utf8_wcstombs(). Unfortunately they are not an ideal
match to what we want for several reasons:

  - They work with wchar_t, typedef'd as __u16 in
    include/linux/nls.h. Hence they expect to see wide-characters
    stored in native byte order only.

  - They don't handle surrogate pairs.

  - The string conversion routines just skip over characters they
    can't handle instead of returning an error.

  - The string routines stop as soon as they encounter a NUL.
    They don't allow the input size to be specified.

  - There's a bug in utf8_wcstombs(): The remaining output buffer
    size isn't decremented when a single-byte conversion is done.
    And it _is_ decremented for no apparent reason when an invalid
    character is skipped over.

  - There are no comments or documentation explaining the meaning
    of the functions' arguments.

  - The nls module includes lots of other static array storage; it
    would be preferable if USB didn't have to force all that stuff
    to be loaded.

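
To make the complaints above concrete, here is a minimal user-space
sketch of a converter that avoids most of them. It is only an
illustration, not proposed kernel code: the function name
utf16le_to_utf8() and the two error codes are made up for this example,
and the input is assumed to be little-endian UTF-16. It takes an
explicit input length, does not treat NUL specially, combines surrogate
pairs, returns an error on invalid input, and checks the remaining
output space on every conversion, including the single-byte case.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical error codes; kernel code would more likely use -EINVAL etc. */
#define UTF_ERR_INVALID	(-1)	/* malformed UTF-16 input */
#define UTF_ERR_NOSPACE	(-2)	/* output buffer too small */

/*
 * Convert inlen bytes of little-endian UTF-16 to UTF-8.
 * Returns the number of bytes stored in out[], or a negative error.
 * A NUL code unit is converted like any other character.
 */
long utf16le_to_utf8(const uint8_t *in, size_t inlen,
		     uint8_t *out, size_t outlen)
{
	size_t i = 0, o = 0;

	if (inlen & 1)
		return UTF_ERR_INVALID;	/* truncated code unit */

	while (i < inlen) {
		uint32_t cp = in[i] | ((uint32_t)in[i + 1] << 8);

		i += 2;
		if (cp >= 0xD800 && cp <= 0xDBFF) {
			/* High surrogate: a low surrogate must follow. */
			uint32_t lo;

			if (i >= inlen)
				return UTF_ERR_INVALID;
			lo = in[i] | ((uint32_t)in[i + 1] << 8);
			if (lo < 0xDC00 || lo > 0xDFFF)
				return UTF_ERR_INVALID;
			i += 2;
			cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
		} else if (cp >= 0xDC00 && cp <= 0xDFFF) {
			return UTF_ERR_INVALID;	/* unpaired low surrogate */
		}

		if (cp < 0x80) {
			if (outlen - o < 1)
				return UTF_ERR_NOSPACE;
			out[o++] = (uint8_t)cp;
		} else if (cp < 0x800) {
			if (outlen - o < 2)
				return UTF_ERR_NOSPACE;
			out[o++] = (uint8_t)(0xC0 | (cp >> 6));
			out[o++] = (uint8_t)(0x80 | (cp & 0x3F));
		} else if (cp < 0x10000) {
			if (outlen - o < 3)
				return UTF_ERR_NOSPACE;
			out[o++] = (uint8_t)(0xE0 | (cp >> 12));
			out[o++] = (uint8_t)(0x80 | ((cp >> 6) & 0x3F));
			out[o++] = (uint8_t)(0x80 | (cp & 0x3F));
		} else {
			if (outlen - o < 4)
				return UTF_ERR_NOSPACE;
			out[o++] = (uint8_t)(0xF0 | (cp >> 18));
			out[o++] = (uint8_t)(0x80 | ((cp >> 12) & 0x3F));
			out[o++] = (uint8_t)(0x80 | ((cp >> 6) & 0x3F));
			out[o++] = (uint8_t)(0x80 | (cp & 0x3F));
		}
	}
	return (long)o;
}
```

Whether a real library should fail on bad input like this, or skip it
the way the nls routines do, is exactly one of the open questions.
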
A few other source files, such as drivers/s390/char/keyboard.c, include
their own home-brewed UTF-8 conversions. In some cases the converted
characters aren't stored in a buffer; they are passed to various inline
routines. In no cases are surrogate pairs handled correctly.
In fact, as far as I can see no part of the kernel is prepared to
handle Unicode values higher than U+ffff.
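
For reference, handling values above U+ffff in UTF-16 means splitting
them into a surrogate pair. The arithmetic is small; the helper below
is a hypothetical user-space sketch (the name utf16_encode_surrogates()
is invented here) that shows it.

```c
#include <stdint.h>

/*
 * Hypothetical helper: split a code point in the range
 * U+10000..U+10FFFF into a UTF-16 surrogate pair.
 * Returns 0 on success, -1 if cp is outside that range.
 */
int utf16_encode_surrogates(uint32_t cp, uint16_t *hi, uint16_t *lo)
{
	if (cp < 0x10000 || cp > 0x10FFFF)
		return -1;

	cp -= 0x10000;				/* leaves a 20-bit value */
	*hi = (uint16_t)(0xD800 + (cp >> 10));	/* top 10 bits */
	*lo = (uint16_t)(0xDC00 + (cp & 0x3FF));	/* bottom 10 bits */
	return 0;
}
```

For example, U+1F600 becomes the pair 0xD83D 0xDE00.
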
So the situation is a mess. I'm not even sure what features a library
ought to provide. Something from the following selection:

  - Store output in a buffer or send it to an output routine.

  - Native byte-order, little-endian, or big-endian.

  - Return an error for invalid codes or ignore them.

  - Stop at NUL or take a length argument.

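
Purely as a strawman for discussion, the choices above could be
expressed as an options structure plus a byte-order-aware fetch
routine. Every name below (utf16_endian, utf_conv_opts, utf_emit_fn,
the UTF_CONV_* flags, utf16_get_unit()) is hypothetical; nothing like
this exists in the kernel today, and this is user-space C rather than
kernel code.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Byte order of the UTF-16 side of a conversion (hypothetical). */
enum utf16_endian {
	UTF16_HOST_ENDIAN,
	UTF16_LITTLE_ENDIAN,
	UTF16_BIG_ENDIAN,
};

/* Hypothetical behavior flags, one per open question in the list. */
#define UTF_CONV_SKIP_INVALID	0x1	/* ignore bad codes instead of failing */
#define UTF_CONV_STOP_AT_NUL	0x2	/* stop at NUL even if length remains */

/* Optional output routine for callers that don't want a buffer. */
typedef int (*utf_emit_fn)(void *ctx, const uint8_t *bytes, size_t n);

struct utf_conv_opts {
	enum utf16_endian endian;
	unsigned int flags;
	utf_emit_fn emit;	/* NULL: store into the caller's buffer */
	void *emit_ctx;
};

/* Fetch one 16-bit code unit from p[0..1] in the requested byte order. */
uint16_t utf16_get_unit(const uint8_t *p, enum utf16_endian endian)
{
	uint16_t v;

	switch (endian) {
	case UTF16_LITTLE_ENDIAN:
		return (uint16_t)(p[0] | (p[1] << 8));
	case UTF16_BIG_ENDIAN:
		return (uint16_t)((p[0] << 8) | p[1]);
	default:
		memcpy(&v, p, sizeof(v));	/* host order, alignment-safe */
		return v;
	}
}
```

A single conversion routine taking such a structure would cover every
combination, at the cost of more run-time branching than separate
special-purpose functions.
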
Any suggestions for the best way to organize all this?
Alan Stern