Yung-Fong Tang wrote:
Same thing for JIS X 0208 (a two-byte, and only two-byte, character set, not a variable-length character set). If I am processing an ISO-2022-JP message in JIS X 0208 mode and I get 0x24 0xa8, I know the boundary of that problem is 16 bits, not 8 bits or 32 bits.

Not true. You don't know if
- a byte was dropped before or after 0x24 -> the first sequence is only 1 byte
- a byte was corrupted to become 0xa8 -> the sequence consists of two bytes
- a wild combination of multiple errors

With a single-unit encoding, you can always assume that an illegal unit is a one-unit error. With any multi-unit encoding, you can only guess.
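
To make the ambiguity concrete, here is a rough Python sketch. The helper names and the sample bytes after 0x24 0xa8 are made up for illustration, and it only pairs up raw bytes rather than running a real ISO-2022-JP decoder. It reads the same received bytes under two of the hypotheses above and gets two different framings for everything that follows.

```python
def single_unit_recover(data):
    """Single-unit case (7-bit bytes assumed): drop each illegal unit.

    Every other unit keeps its meaning, so the damage stays local.
    """
    return bytes(b for b in data if b < 0x80)


def pair_up(data):
    """Two-byte case: frame the bytes as consecutive pairs.

    Returns the list of pairs and a dangling last byte (or None).
    """
    pairs = [(data[i], data[i + 1]) for i in range(0, len(data) - 1, 2)]
    dangling = data[-1] if len(data) % 2 else None
    return pairs, dangling


# Hypothetical received bytes in two-byte mode, starting with 0x24 0xa8.
received = bytes([0x24, 0xA8, 0x24, 0x22, 0x24, 0x24])

# Hypothesis 1: 0xa8 is a corrupted trail byte, so the bad unit is 2 bytes.
corrupted_trail = pair_up(received[2:])

# Hypothesis 2: the byte after 0x24 was dropped, so 0x24 stands alone and
# 0xa8 begins the next (also damaged) pair: 3 bytes are affected.
dropped_byte = pair_up(received[3:])

print(corrupted_trail)  # pairs (0x24,0x22), (0x24,0x24); nothing dangling
print(dropped_byte)     # pair (0x22,0x24); 0x24 left dangling
```

Both framings contain only byte values that are legal in two-byte mode, so nothing in the data itself says which one is right. In the single-unit case, removing the one illegal unit leaves the rest of the text unchanged.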

markus

--
Opinions expressed here may not reflect my company's positions unless otherwise noted.



