On 1 June 2014 12:26, Steven D'Aprano <steve+comp.lang.pyt...@pearwood.info> wrote:

> "with cross-platform behavior preferred over system-dependent one" --
> It's not clear how cross-platform behaviour has anything to do with the
> Internet age. Python has preferred cross-platform behaviour forever,
> except for those features and modules which are explicitly intended to be
> interfaces to system-dependent features. (E.g. a lot of functions in the
> os module are thin wrappers around OS features. Hence the name of the
> module.)

One exception is the behaviour of defaulting input and output to the system
encoding. I personally think we would all be better off if Python (and
Java, and many other languages) defaulted to UTF-8. Hopefully this would
eventually lead producers to output UTF-8 by default, and consumers to
specify an encoding manually when the data is not UTF-8 (which they would
notice because strict decoding fails on invalid byte sequences).
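
As a rough sketch of what I mean in Python 3 terms (the helper name
decode_text and the sample strings are made up purely for illustration):
decode as UTF-8 unless the caller says otherwise, and let the decode fail
loudly when the data is something else.

```python
import locale

# What Python would otherwise fall back to for text I/O on this machine.
system_default = locale.getpreferredencoding(False)


def decode_text(data, declared_encoding=None):
    """Hypothetical helper: assume UTF-8 unless an encoding is declared."""
    return data.decode(declared_encoding or "utf-8")


text = "na\u00efve caf\u00e9"

# A producer that emits UTF-8 needs no extra information on the consumer side.
utf8_result = decode_text(text.encode("utf-8"))

# Latin-1 bytes are not valid UTF-8 here, so strict decoding raises
# instead of silently producing garbage.
legacy = text.encode("latin-1")
try:
    decode_text(legacy)
    utf8_failed = False
except UnicodeDecodeError:
    utf8_failed = True

# The consumer then has to name the encoding explicitly.
legacy_result = decode_text(legacy, "latin-1")
```

The point of failing on the UTF-8 attempt is that the consumer finds out
immediately, rather than months later when someone notices the mojibake.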

I'm currently working on a product that interacts with lots of other
products. These other products can be using any encoding - but most of the
functions that interact with I/O assume the system default encoding of the
machine that is collecting the data. The product has been in production for
nearly a decade, so there's a lot of pushback against changes deep in the
code for fear that it will break working systems. The fact that they are
working largely by accident appears to escape them ...

FWIW, in that codebase, changing to ISO-8859-1 (Latin-1) by default would be
the most sensible option: it maps every byte to a code point, so it
effectively treats everything as bytes. Another encoding could then be
applied if/when more information is known (e.g. there's often a call that
returns the encoding, and the output of that call is guaranteed to be
ASCII).
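
To illustrate why Latin-1 is safe as a holding pattern, here is a minimal
Python 3 sketch (redecode is a hypothetical helper, not anything in the
product): decoding as Latin-1 never fails and loses nothing, so the text
can be reinterpreted once the real encoding turns up.

```python
# Every possible byte value.
raw = bytes(range(256))

# Latin-1 maps byte N to code point N, so decoding cannot fail,
# and encoding again gives back exactly the original bytes.
as_text = raw.decode("latin-1")
round_tripped = as_text.encode("latin-1")


def redecode(text_as_latin1, real_encoding):
    """Hypothetical helper: reinterpret Latin-1-decoded text once the
    real encoding is known."""
    return text_as_latin1.encode("latin-1").decode(real_encoding)


# Data that was really UTF-8 but got read with the Latin-1 default.
payload = "caf\u00e9".encode("utf-8")
provisional = payload.decode("latin-1")  # wrong characters, but lossless

# Later, some call reports the actual encoding (assumed here to be UTF-8).
fixed = redecode(provisional, "utf-8")
```

The provisional string looks wrong if displayed, but nothing has been
thrown away, which is more than can be said for decoding with whatever the
collecting machine's default happens to be.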

Tim Delaney
