On 17 January 2014 21:37, Chris Barker <chris.bar...@noaa.gov> wrote:
>
> For the record, we've got a pretty good thread (not this good, though!) over
> on the numpy list about how to untangle the mess that has resulted from
> porting text-file-parsing code to py3 (and the underlying issue with the 'S'
> data type in numpy...)
>
> One note from the github issue:
> """
>  The use of asbytes originates only from the fact that b'%d' % (20,) does
> not work.
> """
>
> So yeah PEP 461! (even if too late for numpy...)

The discussion about numpy.loadtxt and the 'S' dtype is not relevant
to PEP 461.  PEP 461 is about facilitating handling ascii/binary
protocols and file formats. The loadtxt function is for reading text
files. Reading text files is already handled very well in Python 3.
The only caveat is that you need to specify the encoding when you open
the file.
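
A minimal sketch of that caveat, assuming a made-up file name and
contents purely for illustration: passing the encoding explicitly makes
the result independent of the system default.

```python
import os
import tempfile

# Hypothetical sample data: a small text file written as UTF-8.
path = os.path.join(tempfile.mkdtemp(), "data.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write("café 1.5\nnaïve 2.5\n")

# Stating the encoding explicitly yields the same str objects on every
# platform, whatever the locale's default encoding happens to be.
with open(path, encoding="utf-8") as f:
    rows = [line.split() for line in f]

names = [row[0] for row in rows]
values = [float(row[1]) for row in rows]
```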

The loadtxt function doesn't specify the encoding when it opens the
file, so on Python 3 it gets the system default encoding when reading
from the file. Since the 'S' dtype is for an array of bytes, the
loadtxt function has to encode the unicode strings before storing them
in the array. The function has no idea what encoding the user wants, so
it just uses latin-1, leading to mojibake if the file content and
encoding are not compatible with latin-1 (e.g. utf-8).
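
The kind of mismatch described above can be sketched as follows. This
is only an illustration of a latin-1/utf-8 mix-up using a single sample
character, not a reproduction of loadtxt's actual code path.

```python
text = "é"
utf8_bytes = text.encode("utf-8")      # two bytes: 0xC3 0xA9

# Decoding UTF-8 bytes as latin-1 never raises; it silently produces
# mojibake because every byte maps to some latin-1 character.
garbled = utf8_bytes.decode("latin-1")

# Going the other way, text encoded as latin-1 is generally not valid
# UTF-8, so a consumer expecting UTF-8 cannot decode it.
latin1_bytes = text.encode("latin-1")  # one byte: 0xE9
try:
    latin1_bytes.decode("utf-8")
    decodes_as_utf8 = True
except UnicodeDecodeError:
    decodes_as_utf8 = False
```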

The loadtxt function is a classic example of how *not* to handle text,
and whoever made it that way probably didn't understand unicode and the
Python 3 text model. If they did understand what they were doing, then
they knew that they were implementing a dirty hack.

If you want to draw a relevant lesson from that thread in this one
then the lesson argues against PEP 461: adding back the bytes
formatting methods helps people who refuse to understand text
processing and continue implementing dirty hacks instead of doing it
properly.


Oscar
