Wu Yongwei said:

> IMHO, if the input contains possibly embedded nulls, it is then simply
> NOT "null-terminated string". The concept of null-terminated string
> should only be used with data without embedded nulls, such as ASCII,
> EUC-JP, EUC-CN, UTF-8, and so on. If this is not the case (as in
> UTF-16), using data length along with the data make your life much easier.
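
To make the quoted point concrete, here is a rough sketch of what a null-terminated reader sees when handed UTF-8 versus UTF-16 bytes, and how carrying the length alongside the data sidesteps the problem. The helper names (`c_strlen`, `frame`, `unframe`) and the 4-byte big-endian length prefix are assumptions made up for this illustration, not anything from a particular protocol or library.

```python
import struct


def c_strlen(buf: bytes) -> int:
    """Mimic C strlen(): count the bytes before the first 0x00 byte."""
    end = buf.find(b"\x00")
    return len(buf) if end == -1 else end


def frame(payload: bytes) -> bytes:
    """Prefix the payload with its length (hypothetical 4-byte big-endian header)."""
    return struct.pack(">I", len(payload)) + payload


def unframe(buf: bytes) -> bytes:
    """Read the length header, then return exactly that many payload bytes."""
    (length,) = struct.unpack_from(">I", buf, 0)
    return buf[4:4 + length]


text = "Linux"
utf8 = text.encode("utf-8")        # no zero bytes for this text
utf16 = text.encode("utf-16-le")   # every ASCII character gains a 0x00 byte

utf8_seen = c_strlen(utf8)         # the whole string is visible
utf16_seen = c_strlen(utf16)       # stops right after the first byte of 'L'

# With an explicit length, the embedded zero bytes are just data.
recovered = unframe(frame(utf16)).decode("utf-16-le")

print(utf8_seen, utf16_seen, len(utf16), recovered)
```

For this input the null-terminated view covers all 5 bytes of the UTF-8 form but only 1 of the 10 bytes of the UTF-16 form, while the length-prefixed round trip gets the text back intact.
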
I agree. If it were up to me, I would either prefix the size or stick to
favorable encodings. Unfortunately, many people who came before us
didn't, and there's not much I can do about that. Consider mail software,
the W3C is pushing UTF-16, MS protocols use 16-bit encodings with embedded
nulls all over the place, Shift-JIS has embedded nulls, etc.

Mike

--
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/linux-utf8/
