Wu Yongwei said:
> IMHO, if the input may contain embedded nulls, then it is simply
> NOT a "null-terminated string". The concept of a null-terminated string
> should only be used with data without embedded nulls, such as ASCII,
> EUC-JP, EUC-CN, UTF-8, and so on. If this is not the case (as in
> UTF-16), carrying the data length along with the data makes your life
> much easier.

I agree. If it were up to me, I would either prefix the size or stick to
favorable encodings. Unfortunately, many people who came before us
didn't, and there's not much I can do about that. Consider mail
software. The W3C is pushing UTF-16, MS protocols use 16-bit encodings
with embedded nulls all over the place, Shift-JIS has embedded nulls,
etc.
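
To make the problem concrete, here is a minimal Python sketch (not from
the original thread). The helper names c_strlen, pack_with_length and
unpack_with_length are made up for illustration, and the 4-byte
big-endian length prefix is just one assumed framing, not any particular
protocol's format. It shows a byte-oriented "null-terminated" view
cutting UTF-16 data short, while a length prefix carries it intact:

```python
import struct

def c_strlen(buf: bytes) -> int:
    """Mimic C strlen(): count the bytes before the first 0x00 byte."""
    idx = buf.find(b"\x00")
    return len(buf) if idx == -1 else idx

def pack_with_length(payload: bytes) -> bytes:
    """Hypothetical framing: 4-byte big-endian length, then the raw bytes."""
    return struct.pack(">I", len(payload)) + payload

def unpack_with_length(frame: bytes) -> bytes:
    """Read the length prefix and return exactly that many payload bytes."""
    (n,) = struct.unpack(">I", frame[:4])
    return frame[4:4 + n]

utf8 = "Hi".encode("utf-8")        # b'Hi', no embedded nulls
utf16 = "Hi".encode("utf-16-le")   # b'H\x00i\x00', a null after each letter

print(c_strlen(utf8))    # 2
print(c_strlen(utf16))   # 1, the byte-wise scan stops after 'H'
print(unpack_with_length(pack_with_length(utf16)).decode("utf-16-le"))  # Hi
```

The length-prefixed round trip never inspects the payload bytes, so it
works the same for any encoding, with or without embedded nulls.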

Mike

--
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
