On 6/3/2014 10:18 AM, Robin Becker wrote:

> I think the idea that we only give meaning to binary data using
> encodings is a bit limiting.

On the contrary, it is liberating. The fact that bits have no meaning other than 'a choice between two alternatives' means:

1. Any binary choice - 0/1, -/+, false/true, no/yes, closed/open, male/female, sad/happy, evil/good, low/high, and so on ad infinitum, can be encoded into a bit. Since any such pair could have been reversed, the mapping between bit states and the pair is arbitrary, and constitutes an encoding.

2. Any discrete or digitized information that constitutes a choice between multiple alternatives can be encoded into a sequence of bits.
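As a minimal sketch of both points (the helper names `encode_choice` and `decode_choice` are made up for illustration), a choice among any list of alternatives can be written as a fixed-width bit string, and reversing the list gives an equally valid encoding:

```python
def encode_choice(alternatives, choice):
    """Encode one choice among several alternatives as a string of bits."""
    # Number of bits needed to tell all the alternatives apart (at least 1).
    width = max(1, (len(alternatives) - 1).bit_length())
    return format(alternatives.index(choice), f"0{width}b")

def decode_choice(alternatives, bits):
    """Recover the choice; the receiver must use the same list in the same order."""
    return alternatives[int(bits, 2)]

moods = ["sad", "happy"]
seasons = ["spring", "summer", "autumn", "winter"]

print(encode_choice(moods, "happy"))        # 1
print(encode_choice(moods[::-1], "happy"))  # 0: the reversed mapping is just as valid
print(encode_choice(seasons, "autumn"))     # 10
print(decode_choice(seasons, "10"))         # autumn
```

The bits mean nothing until sender and receiver agree on the list and its order; that agreement is the encoding.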

This crucial discovery is the basis of Shannon's 1948 paper and of the information age that started about then.

> A zip or gif file has structure, but I don't think it's reasonable
> to regard such a file as having an encoding in the python unicode sense.

I am not quite sure what you are denying. Color encodings are encodings as much as character encodings, even if they encode different information. Both encode sensory experience and conceptual correlates into a sequence of bits, usually organized for convenience into a sequence of bytes or other chunks.
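To illustrate, here is a small sketch in which the same three bytes are read once as characters and once as a color. The RGB layout (one byte each for red, green, blue) is an assumption chosen for the example, not something the bytes themselves declare:

```python
data = bytes([72, 105, 33])

# Read as ASCII-encoded characters.
as_text = data.decode("ascii")

# Read as one 24-bit RGB pixel (assumed layout: red, green, blue, one byte each).
red, green, blue = data

print(as_text)             # Hi!
print((red, green, blue))  # (72, 105, 33)
```

Neither reading is more 'natural' than the other; each is just a different agreed-upon encoding applied to the same bits.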

There is another similarity. Text files often have at least two levels of encoding. First is the character encoding; that is all unicode handles. Then there is the text structure encoding, which is sometimes called the 'file format'. Most text files are at least structured into 'lines'. For this, they use encoded line endings, and there have been multiple choices for this and at least 2 still in common use (which is a nuisance).
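The two levels can be sketched separately in Python. The sample text and the choice of UTF-8 are assumptions for the example; it deliberately mixes the two line-ending conventions still in common use:

```python
# Bytes as they might arrive from a file, with mixed '\r\n' and '\n' endings.
raw = "caf\u00e9\r\nna\u00efve\nend".encode("utf-8")

# Level 1: character encoding (bytes -> characters).
text = raw.decode("utf-8")

# Level 2: line structure (characters -> lines); str.splitlines() accepts
# both the '\r\n' and the '\n' convention.
lines = text.splitlines()

print(lines)  # ['café', 'naïve', 'end']
```

Getting level 1 wrong garbles or rejects the characters; getting level 2 wrong leaves stray '\r' characters or merges lines.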

Similarly, a pixel (bitmap!) image file must encode the color of each pixel and a higher-level structuring of pixels into a 2D array of rows. Just as with text, there have been and still are multiple encodings at both levels. Also, similarly, the receiver of an image must know what encoding the sender used.
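A rough sketch of the two levels for an image, assuming a made-up raw format with 3 bytes per pixel and no header (the function name `decode_rgb_rows` is hypothetical). The same bytes give a different picture depending on the row width the receiver assumes:

```python
def decode_rgb_rows(data, width):
    """Split raw bytes into rows of (r, g, b) pixels, assuming 3 bytes per pixel."""
    # Level 1: color encoding (every 3 bytes -> one pixel).
    pixels = [tuple(data[i:i + 3]) for i in range(0, len(data), 3)]
    # Level 2: structure (every `width` pixels -> one row).
    return [pixels[i:i + width] for i in range(0, len(pixels), width)]

# Four pixels: red, green, blue, white.
raw = bytes([255, 0, 0,  0, 255, 0,  0, 0, 255,  255, 255, 255])

as_2x2 = decode_rgb_rows(raw, width=2)
as_4x1 = decode_rgb_rows(raw, width=4)

print(len(as_2x2), len(as_4x1))  # 2 1
print(as_2x2[1])                 # [(0, 0, 255), (255, 255, 255)]
```

Real image formats put the width, height, and pixel format in a header precisely so the receiver does not have to guess.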

Vector graphics is a different way of encoding certain types of images, and again there are multiple ways to encode the information into bits. The encoding hassle here is similar to that for text. One of the frustrations of tk is that it natively uses just one old dialect of PostScript (.ps) to output screen images. One has to find and install an extension to get output in a modern Scalable Vector Graphics (.svg) encoding.

Because Python is programmed with lines of text, it must come with minimal text decoding. If Python were programmed with drawings, it would come with one or more drawing decoders and a drawing equivalent of a lexer. It might even have a special 'rd' (read drawing) mode for open.
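For comparison, this is roughly what the existing text support in open looks like. The sketch writes a small temporary file (the file name and contents are arbitrary) and reads it back in binary mode and in text mode:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "demo.txt")

    # Write raw bytes: UTF-8 characters plus a '\r\n' line ending.
    with open(path, "wb") as f:
        f.write("n\u00e9\r\n".encode("utf-8"))

    # Binary mode: no decoding at all, just the bytes.
    with open(path, "rb") as f:
        as_bytes = f.read()

    # Text mode: bytes are decoded to characters and, with the default
    # newline handling, '\r\n' is translated to '\n'.
    with open(path, "r", encoding="utf-8") as f:
        as_text = f.read()

print(as_bytes)       # b'n\xc3\xa9\r\n'
print(repr(as_text))  # 'né\n'
```

Text mode applies both levels of text decoding, character encoding and line endings; a hypothetical 'rd' mode would do the analogous work for drawings.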

Terry Jan Reedy

