Markus Kuhn wrote:
> In general, the POSIX definition of iconv_open() would become *much*
> more useful, if it actually specified a couple of encoding strings, and
> what exactly they mean.

I second that. Java has a similar "minimal supported set of encodings" in
its conversion facility.

>   ""          multi-byte encoding of current LC_CTYPE locale
>   "UTF-8"     UTF-8 (with overlong sequences being illegal)
>   "UTF-16"    UTF-16 (same byte order as C's short)
>   "UTF-16BE"  UTF-16 BigEndian
>   "UTF-16LE"  UTF-16 LittleEndian
>   "UTF-32"    UTF-32 (same byte order as C's long)
>   ...

"UTF-16" and "UTF-32" are defined differently than "same byte order as
C's short", in RFC 2781. It's better to refer to their lengthy definition
in RFC 2781.

> and perhaps even
>
>   "EUC-JP", "EUC-KR", "EUC-TW", "GB18030"

I don't think there is a normative, widely used definition of EUC-TW.
As for GB18030, its official definition is in Chinese, not English, and
that does nothing to prevent different vendors from implementing it
differently.

Bruno
--
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
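
To illustrate why agreed encoding names matter, here is a minimal sketch in C of a conversion that depends on two of the proposed strings. It assumes the implementation accepts the names "UTF-8" and "UTF-16BE" (glibc and GNU libiconv do, but POSIX does not guarantee any name, which is the point of this thread). The helper name utf8_to_utf16be is hypothetical and exists only for this example.

```c
#include <iconv.h>
#include <stddef.h>

/*
 * Convert a UTF-8 buffer to UTF-16BE.
 * Returns the number of bytes written to out, or (size_t)-1 on any error
 * (unknown encoding name, invalid or incomplete input, output too small).
 *
 * Assumption: "UTF-8" and "UTF-16BE" are recognised by iconv_open() on
 * this system. POSIX leaves the set of names implementation-defined.
 */
size_t utf8_to_utf16be(const char *in, size_t inlen, char *out, size_t outcap)
{
    iconv_t cd = iconv_open("UTF-16BE", "UTF-8");   /* (tocode, fromcode) */
    if (cd == (iconv_t)-1)
        return (size_t)-1;      /* name not supported by this implementation */

    /* iconv() takes char ** for historical reasons; it does not modify
       the input bytes, so casting away const is safe here. */
    char *inp = (char *)in;
    char *outp = out;
    size_t inleft = inlen;
    size_t outleft = outcap;

    size_t rc = iconv(cd, &inp, &inleft, &outp, &outleft);
    iconv_close(cd);

    if (rc == (size_t)-1)
        return (size_t)-1;      /* errno is EILSEQ, EINVAL or E2BIG */

    return outcap - outleft;
}
```

Naming the byte order explicitly ("UTF-16BE") sidesteps the question of what a bare "UTF-16" means on a given system. With a converter that treats overlong sequences as illegal, passing the two bytes 0xC0 0x80 should make the call fail rather than produce U+0000.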
