Tim Delaney <timothy.c.delaney <at> gmail.com> writes:
>
> I also should have been more clear that *in the particular situation I
> was talking about* iso-latin-1 as default would be the right thing to
> do, not in the general case. Quite often we won't know the correct
> encoding until we've executed a command via ssh - iso-latin-1 will
> allow us to extract the info we need (which will generally be 7-bit
> ASCII) without the possibility of an invalid encoding. Sure we may get
> mojibake, but that's better than the alternative when we don't yet
> know the correct encoding.
>
> Latin-1 is one of those legacy encodings which needs to die, not to be
> entrenched as the default. My terminal uses UTF-8 by default (as it
> should), and if I use the terminal to input "δжç", Python ought to see
> what I input, not Latin-1 moji-bake.
>
>
> For some purposes, there needs to be a way to treat an arbitrary
> stream of bytes as an arbitrary stream of 8-bit characters.
> iso-latin-1 is a convenient way to do that.
>

For that purpose, Python 3 has the bytes type. Read the data as is,
then decode it to a string once you have figured out its encoding.

Wolfgang

-- 
https://mail.python.org/mailman/listinfo/python-list
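
A minimal sketch of the bytes-first approach described above: keep the data as bytes, pick out the ASCII parts, and decode the rest only once the encoding is known. The sample data in `raw`, its `key=value` layout, and the eventual choice of UTF-8 are all assumptions made up for illustration; nothing here comes from the original ssh scenario.

```python
# Hypothetical raw output captured from a remote command. The bytes happen
# to be UTF-8, but at this point the reader does not know that yet.
raw = "host=δжç\n".encode("utf-8")

# Step 1: stay in bytes and split on ASCII delimiters. No decoding is
# needed for this, so no encoding error is possible.
key, _, value = raw.rstrip(b"\n").partition(b"=")
key_text = key.decode("ascii")  # the part we need is 7-bit ASCII

# Step 2: once the encoding has been determined, decode the remainder.
encoding = "utf-8"  # assumption: learned later, e.g. from the remote locale
value_text = value.decode(encoding)

# For comparison: decoding as Latin-1 never raises, but UTF-8 input comes
# out as mojibake (one character per byte).
mojibake = value.decode("latin-1")

print(key_text, value_text, repr(mojibake))
```

Decoding as Latin-1 is lossless in the sense that `mojibake.encode("latin-1")` gives back the original bytes, but keeping the data as `bytes` makes it explicit that the encoding has not been decided yet.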