<anbu at peoplestring dot com> wrote:

Document encoded in SCSU or BOCU-1, given that the document contains
only ASCII characters, may appear corrupt on a system that doesn't
recognise SCSU or BOCU-1.

This is the curious point of view that ASCII compatibility (or transparency) is a bad thing. It does not apply to BOCU-1, which is not ASCII-transparent.

Documents encoded in *any* format are likely to appear corrupt on a system that doesn't recognize the encoding. They are guaranteed to appear corrupt if character boundaries do not align with byte boundaries, which is what you propose here.
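To make the byte-alignment point concrete, here is a minimal sketch. It assumes a hypothetical format that packs each ASCII character into 7 bits with no padding between characters; this is a stand-in chosen for illustration, not the encoding actually proposed in this thread. The helper name pack_7bit is likewise made up.

```python
def pack_7bit(text: str) -> bytes:
    """Pack ASCII characters as consecutive 7-bit codes (hypothetical format)."""
    bits = "".join(format(ord(ch), "07b") for ch in text)
    bits += "0" * (-len(bits) % 8)  # pad the final byte with zero bits
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


text = "Hello, world"
byte_aligned = text.encode("ascii")  # one character per byte
packed = pack_7bit(text)             # characters straddle byte boundaries

# A reader that knows nothing about either format and simply shows one
# character per byte recovers the text only in the byte-aligned case.
seen_aligned = byte_aligned.decode("latin-1")
seen_packed = packed.decode("latin-1")

print(repr(seen_aligned))
print(repr(seen_packed))
```

The packed form is one byte shorter (11 bytes instead of 12), but a byte-oriented viewer has no way to show it as anything other than garbage.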

01100001100101011001101110100101010110011000101010100101011101110101

If I'm going to use a variable-length, non-byte-aligned encoding, where there is no chance of realigning in case of a flipped or dropped bit (which seems to be of great concern to many people), I might as well go ahead and use a Huffman or LZ type of encoding (or a combination, like DEFLATE).
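The realignment problem can be sketched as follows. The bit stream below is a fixed-width 7-bit packing, used only as a simple stand-in for a non-byte-aligned stream; it is an assumption of this example, not the encoding under discussion. UTF-8 is included purely as a point of comparison, and the helper names to_bits and from_bits are hypothetical.

```python
def to_bits(text: str) -> str:
    """Concatenate 7-bit codes with no alignment or sync markers."""
    return "".join(format(ord(ch), "07b") for ch in text)


def from_bits(bits: str) -> str:
    """Decode 7 bits at a time, ignoring any incomplete trailing group."""
    usable = len(bits) - len(bits) % 7
    return "".join(chr(int(bits[i:i + 7], 2)) for i in range(0, usable, 7))


# Drop a single bit inside the second character of the bit stream.
text = "Hello, world"
bits = to_bits(text)
damaged_bits = bits[:10] + bits[11:]
bit_result = from_bits(damaged_bits)

# Drop a single byte (the trail byte of the first accented letter) from UTF-8.
utf8_text = "naïve café déjà vu"
data = utf8_text.encode("utf-8")
damaged_bytes = data[:3] + data[4:]
byte_result = damaged_bytes.decode("utf-8", errors="replace")

print(repr(bit_result))
print(repr(byte_result))
```

In the bit stream, only the leading "H" survives, because every later code is read one bit out of phase. In the UTF-8 case the damage is confined to the one affected character, since lead and trail bytes are distinguishable and the decoder can pick up again at the next lead byte.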

Is this the same encoding you were proposing a little over a year ago, or an outgrowth of the same ideas?

--
Doug Ewell | Thornton, Colorado, USA
http://www.ewellic.org | @DougEwell
