> However it is likely to be often wrong, and where the user's locale
> specifies an encoding like CP1252 then it will result in silent
> corruption rather than an immediate exception.

Why do you say that? Why do you think it is likely to be wrong often?
Most likely, encoding text files with cp1252 will be exactly right,
and exactly what the end user wanted.
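
A minimal sketch of the two cases under discussion, assuming a locale
default of cp1252 and the sample string "café" (both chosen only for
illustration). It shows that a genuine cp1252 file decodes correctly,
while UTF-8 bytes decoded as cp1252 raise no exception and produce
different text:

```python
text = "café"

utf8_bytes = text.encode("utf-8")      # b'caf\xc3\xa9'
cp1252_bytes = text.encode("cp1252")   # b'caf\xe9'

# A file that really is cp1252 decodes correctly with the locale default.
right = cp1252_bytes.decode("cp1252")

# A UTF-8 file decoded as cp1252 raises nothing here; the two bytes of
# the UTF-8 sequence are each mapped to a separate cp1252 character.
wrong = utf8_bytes.decode("cp1252")

print(repr(right))
print(repr(wrong))
```

Which case is more common depends on where the files come from, which
is the point in dispute.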

> This is why I'm keen that by *default* Python should honour the UTF8
> signature when reading files; particularly given that programmers who
> don't/can't/won't understand encodings are likely to read files without
> specifying an encoding and a lot of the time it will *seem* to work.

That's probably a reasonable idea, but it may also make things worse:
on writing, you'd still use cp1252, so you may end up writing the
file out in a different encoding from the one it was read in. That
would be particularly unfortunate if you were merely performing some
simple text replacement.
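
The mismatch can be sketched as follows. This is only an illustration
under stated assumptions: `read_honouring_signature` is a hypothetical
helper standing in for the proposed default behaviour, the locale
encoding is assumed to be cp1252, and the file contents are made up.

```python
import codecs
import os
import tempfile


def read_honouring_signature(path, fallback="cp1252"):
    """Hypothetical helper: decode as UTF-8 when the file starts with
    the UTF-8 signature, otherwise use the fallback encoding."""
    with open(path, "rb") as f:
        raw = f.read()
    if raw.startswith(codecs.BOM_UTF8):
        return raw[len(codecs.BOM_UTF8):].decode("utf-8")
    return raw.decode(fallback)


fd, path = tempfile.mkstemp()
os.close(fd)
try:
    # Input file: UTF-8 with signature.
    with open(path, "wb") as f:
        f.write(codecs.BOM_UTF8 + "naïve café\n".encode("utf-8"))

    # Simple text replacement, read with signature detection...
    text = read_honouring_signature(path)
    replaced = text.replace("café", "coffee")

    # ...but written back with the assumed locale default.
    with open(path, "w", encoding="cp1252", newline="") as f:
        f.write(replaced)

    with open(path, "rb") as f:
        rewritten = f.read()
finally:
    os.remove(path)

# The output no longer carries the signature and is no longer valid UTF-8.
try:
    rewritten.decode("utf-8")
    still_utf8 = True
except UnicodeDecodeError:
    still_utf8 = False

print(rewritten, still_utf8)
```

A tool that wanted to avoid this would have to remember the encoding it
detected on input and reuse it on output, which is a separate design
decision from the default used for reading.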

So whatever the API, there are always tradeoffs.

Regards,
Martin
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev