On 6/2/2014 7:10 AM, Robin Becker wrote:

> there seems to be an implicit assumption in python land that encoded
> strings are the norm.

I don't know why you say that. To have a stream of bytes interpreted as characters, open the file in text mode and give the encoding. Otherwise, open it in binary mode and apply whatever encoding you want. Image programs like PIL or Pillow assume that bytes have image encodings. Same idea.
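
To make the two modes concrete, here is a minimal sketch. The scratch file and the sample string are made up for illustration, and UTF-8 is assumed as the encoding.

```python
import os
import tempfile

text = "na\u00efve caf\u00e9"  # sample string with two non-ASCII characters

# Hypothetical scratch file, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "sample.txt")

# Text mode: Python encodes on write and decodes on read,
# using the encoding given to open().
with open(path, "w", encoding="utf-8") as f:
    f.write(text)
with open(path, "r", encoding="utf-8") as f:
    as_text = f.read()

# Binary mode: the same file comes back as undecoded bytes,
# and the caller decides what, if anything, to do with them.
with open(path, "rb") as f:
    as_bytes = f.read()

decoded_by_hand = as_bytes.decode("utf-8")
```

Either way the bytes on disk are identical; the mode only decides who does the decoding.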

> On virtually every computer I encounter that assumption is wrong.

Except for the std streams (see below), it is also not part of Python.

I will just point out that bytes are given meaning by encoding meaning into them. Unicode attempts to reduce the hundreds of text encodings to just a few, and mostly to just one for external storage and transmission.

> In python I would have preferred for bytes to remain the default io

Do you really think that defaulting the open mode to 'rb' rather than 'rt' would be a better choice for newbies?

> mechanism, at least that would allow me to decide if I need any decoding.

Assuming that 'rb' is actually needed more than 'rt' for you in particular, is it really such a burden to give a mode more often than not?

> As the cat example
> http://lucumr.pocoo.org/2014/5/12/everything-about-unicode/
> showed these extra assumptions are sometimes really in the way.

This example is *only* about the *pre-opened* std streams (stdin, stdout, stderr). Python uses these to read characters from the keyboard and print characters to the screen for input, print, and the interactive interpreter. So they are open in text mode (which wraps binary read and write). The developers, knowing that people can and do write batch mode programs that avoid input and print, gave a documented way to convert the streams back to binary. (See the sys doc.)
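
A rough sketch of what "text mode wraps binary" means. It uses an in-memory io.BytesIO as a stand-in for the real stream so that it runs anywhere; with the real streams, the binary layer is reached as sys.stdout.buffer, as described in the sys doc.

```python
import io

# Stand-in for the binary stream underneath a std stream.
raw = io.BytesIO()

# The text layer wraps the binary layer, as sys.stdout does.
text_stream = io.TextIOWrapper(raw, encoding="utf-8")

# Writing characters: the wrapper encodes them for us.
text_stream.write("h\u00e9llo")
text_stream.flush()  # push pending text down before touching the buffer

# Writing bytes: go through .buffer, and no encoding is applied.
text_stream.buffer.write(b"\xff\xfe")
text_stream.buffer.flush()

data = raw.getvalue()
```

The two bytes written through .buffer are not valid UTF-8, which is the point: the binary layer passes them along untouched.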

The issue Armin ran into is this. He wrote a library module that makes sure the streams are binary. Someone else does the same. A program imports both modules, in either order. The conversion method referenced above raises an exception if one attempts to convert an already converted stream. Much of the extra code Armin published detects whether the stream is already binary or needs conversion.

The obvious solution is to enhance the conversion method so that one may say 'convert if needed, otherwise just pass'.
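
Until something like that exists, the same effect can be approximated in user code. The helper below is a hypothetical sketch, not an existing stdlib function; the name as_binary is made up, and it assumes the text stream exposes its binary layer as .buffer, as io.TextIOWrapper does.

```python
import io

def as_binary(stream):
    """Return a binary stream for `stream`; hypothetical helper.

    Text streams are unwrapped via .buffer; streams that are
    already binary are returned unchanged, so calling this twice
    is harmless.
    """
    buffer = getattr(stream, "buffer", None)
    if isinstance(stream, io.TextIOBase) and buffer is not None:
        stream.flush()  # do not lose text still held in the wrapper
        return buffer
    if isinstance(stream, (io.RawIOBase, io.BufferedIOBase)):
        return stream  # already binary: just pass
    raise TypeError("cannot get a binary stream from %r" % (stream,))

# Two independent libraries "converting" the same stream:
raw = io.BytesIO()  # stand-in for the real binary stream
wrapped = io.TextIOWrapper(raw, encoding="utf-8")

first = as_binary(wrapped)   # library A converts
second = as_binary(first)    # library B repeats it, no exception
```

Both calls end up with the same underlying binary object, whichever module runs first.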

--
Terry Jan Reedy
