<anbu at peoplestring dot com> wrote:
> Document encoded in SCSU or BOCU-1, given that the document contains
> only ASCII characters, may appear corrupt on a system that doesn't
> recognise SCSU or BOCU-1.

This is the curious point of view that ASCII compatibility (or
transparency) is a bad thing. It does not apply to BOCU-1, which is not
ASCII-transparent.

Documents encoded in *any* format are likely to appear corrupt on a
system that doesn't recognize the encoding. They are guaranteed to
appear corrupt if character boundaries do not align with byte
boundaries, which is what you propose here.
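
To make the alignment point concrete, here is a minimal Python sketch.
It assumes a hypothetical scheme that stores each ASCII character as a
7-bit code, packed back to back; that is my own illustration, not the
encoding proposed in this thread, and the names pack_7bit and
unpack_7bit are made up for the example.

```python
def pack_7bit(text: str) -> bytes:
    """Pack ASCII text as consecutive 7-bit codes (hypothetical scheme)."""
    # encode("ascii") raises if the text is not pure ASCII.
    bits = "".join(format(b, "07b") for b in text.encode("ascii"))
    bits += "0" * (-len(bits) % 8)  # zero-pad to a whole number of bytes
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


def unpack_7bit(data: bytes, count: int) -> str:
    """Recover `count` characters from a buffer made by pack_7bit."""
    bits = "".join(format(b, "08b") for b in data)
    return "".join(chr(int(bits[i:i + 7], 2)) for i in range(0, 7 * count, 7))


packed = pack_7bit("Hello")
print(packed)                  # the raw bytes a naive viewer would see
print(unpack_7bit(packed, 5))  # only a scheme-aware decoder recovers the text
```

After the first character, every code straddles a byte boundary, so a
viewer that treats the buffer as ASCII or Latin-1 has no way to show
"Hello".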
> 01100001100101011001101110100101010110011000101010100101011101110101

If I'm going to use a variable-length, non-byte-aligned encoding, where
there is no chance of realigning in case of a flipped or dropped bit
(which seems to be of great concern to many people), I might as well go
ahead and use a Huffman or LZ type of encoding (or a combination, like
DEFLATE).
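
A rough Python sketch of the realignment problem, under the same kind
of assumption: a hypothetical stream of back-to-back 7-bit codes, not
the encoding proposed in this thread. The helpers to_bits and from_bits
are invented for the illustration. The UTF-8 lines at the end are there
only for contrast, since a byte-oriented channel loses whole bytes
rather than single bits.

```python
def to_bits(text: str) -> str:
    """Write ASCII text as a string of back-to-back 7-bit codes."""
    return "".join(format(b, "07b") for b in text.encode("ascii"))


def from_bits(bits: str) -> str:
    """Decode 7 bits at a time, ignoring any incomplete tail."""
    usable = len(bits) - len(bits) % 7
    return "".join(chr(int(bits[i:i + 7], 2)) for i in range(0, usable, 7))


original = "Unicode"
bits = to_bits(original)

# Drop a single bit inside the second character.
damaged = bits[:10] + bits[11:]
decoded = from_bits(damaged)
print(decoded)  # everything after the first character is read from shifted bits

# Contrast: UTF-8 is byte-aligned and marks its lead bytes, so losing the
# lead byte of one character costs only that character.
utf8 = "naïve".encode("utf-8")          # b'na\xc3\xafve'
utf8_damaged = utf8[:2] + utf8[3:]      # drop the 0xC3 lead byte
print(utf8_damaged.decode("utf-8", errors="replace"))
```

Nothing in the 7-bit stream marks where a character starts, so the
decoder cannot find its place again; the UTF-8 decoder picks up at the
next lead byte.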
Is this the same encoding you were proposing a little over a year ago,
or an outgrowth of the same ideas?
--
Doug Ewell | Thornton, Colorado, USA
http://www.ewellic.org | @DougEwell