Re: [Python-Dev] Python3 "complexity"

Kristján Valur Jónsson Thu, 09 Jan 2014 05:03:57 -0800


> -----Original Message-----
> From: Paul Moore [mailto:[email protected]]
> Sent: 9. janúar 2014 10:53
> To: Kristján Valur Jónsson
> Cc: Stefan Ring; [email protected]
> > Moving to python 3, I found that this quickly caused problems.
> 
> You don't say what problems, but I assume encoding/decoding errors. So the
> files apparently weren't in the system encoding. OK, at that point I'd
> probably say to heck with it and use latin-1. Assuming I was sure that (a) I'd
> never hit a non-ascii compatible file (e.g., UTF16) and
> (b) I didn't have a decent means of knowing the encoding.
Right.  But even latin-1, or better, cp1252 (on Windows), does not solve it
because these have undefined code points.  So you need 'surrogateescape'
error handling as well.  Something that I didn't know at the time, having
just come from Python 2 and knowing its Unicode model well.
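
To make that concrete, here is a minimal sketch (the sample bytes and the
helper name strict_fails are made up for illustration).  It assumes CPython's
stock codecs, where cp1252 leaves a handful of bytes such as 0x81 unassigned.
As far as I can tell, CPython's own latin-1 codec maps all 256 byte values,
so the strict failure shows up with cp1252 rather than latin-1.

```python
# Hypothetical sample: 0xE9 is 'é' in cp1252, 0x81 is unassigned there.
raw = b"caf\xe9 \x81"


def strict_fails(data):
    """Return True if a strict cp1252 decode raises on this data."""
    try:
        data.decode("cp1252")
    except UnicodeDecodeError:
        return True
    return False


# With surrogateescape, the undecodable byte 0x81 is smuggled through
# as the lone surrogate U+DC81 instead of raising.
text = raw.decode("cp1252", errors="surrogateescape")

# Encoding with the same handler turns the surrogate back into the
# original byte, so the data survives a round trip unchanged.
round_trip = text.encode("cp1252", errors="surrogateescape")
```

The same errors= argument can be passed to open(), which is probably where
one would use it when processing files of unknown encoding.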


> 
> One thing that genuinely is difficult is that because disk files don't have any
> out-of-band data defining their encoding, it *can* be hard to know what
> encoding to use in an environment where more than one encoding is
> common. But this isn't really a Python issue - as I say, I've hit it with GNU
> tools, and I've had to explain the issue to colleagues using Java on many
> occasions. The key difference is that with grep, people blame the file,
> whereas with Python people blame the language :-) (Of course, with Java,
> people expect this sort of problem so they blame the perverseness of the
> universe as a whole... ;-))

Which reminds me, can Python3 read text files with BOM automatically yet?
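
For reference, a rough sketch of doing it by hand.  The names sniff_bom and
decode_with_bom are hypothetical, not anything in the standard library; the
codecs.BOM_* constants and the 'utf-8-sig', 'utf-16' and 'utf-32' codecs do
exist, and those codecs consume the BOM themselves.  The fallback to utf-8
when no BOM is present is an assumption.

```python
import codecs

# UTF-32 must be checked before UTF-16: the UTF-32-LE BOM starts with
# the same two bytes as the UTF-16-LE BOM.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def sniff_bom(raw, default="utf-8"):
    """Pick a codec name from a leading BOM, or fall back to default."""
    for bom, name in _BOMS:
        if raw.startswith(bom):
            return name
    return default


def decode_with_bom(raw, default="utf-8"):
    """Decode bytes, letting the chosen codec strip the BOM if present."""
    return raw.decode(sniff_bom(raw, default))
```

For a file one would read the first four bytes in binary mode, call
sniff_bom() on them, and then reopen the file in text mode with that encoding.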

K

