M.-A. Lemburg wrote:
> "Martin v. Löwis" wrote:
>> Hmm - what do you mean by "normally"? Normally, text files are meant
>> for human readers, not for exchange between programs.
>
> It's rather common to exchange text files between users... and
> in form of XML and CSV files, these also tend to get used as
> input for programs every now and then ;-)

From a non-Unicode-maven point of view, this seems like a case where
practicality beats purity.

The pure solution would be to refuse to guess and force application
developers to deal with this (in full knowledge that they usually won't,
since too many applications are still written by English speakers who
assume everyone uses ASCII or Latin-1).

The practical solution is to guess an encoding that should work for files
that are restricted to a single machine, or to a network of machines all
configured to use the same default text file encoding. At least the latter
approach only runs into trouble when there is a genuine encoding mismatch
problem. A Python that refused to guess a plausible default encoding for
text files would cause problems *all* the time.

Using the encoding recommended by the locale eliminates a lot of false
alarms, at the potential cost of making the true faults more difficult to
analyse when they do arise. That sounds like a reasonable trade-off from
where I'm standing.

Cheers,
Nick.

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
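
For illustration, the locale-based default argued for above could look roughly like the sketch below. It is a minimal sketch under stated assumptions, not a description of what any Python release actually does: `guess_text_encoding` and `open_text` are hypothetical helper names, and the ASCII fallback is an assumption made for this example.

```python
import locale


def guess_text_encoding():
    """Return the encoding recommended by the locale (hypothetical helper)."""
    # Passing False asks for the preferred encoding without changing the
    # process-wide locale as a side effect.
    # Assumption: fall back to ASCII if the platform reports nothing.
    return locale.getpreferredencoding(False) or "ascii"


def open_text(path, mode="r", encoding=None):
    """Open a text file, guessing the encoding only when none is given."""
    if encoding is None:
        # The "practical" choice: guess a plausible default from the locale.
        encoding = guess_text_encoding()
    # An explicit encoding always wins, so a genuine mismatch can still be
    # handled by the caller.
    return open(path, mode, encoding=encoding)
```

A caller that knows the real encoding of a file can still pass it, for example `open_text("data.csv", encoding="utf-8")`; the guess only applies when nothing better is known.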