On 17 January 2014 21:37, Chris Barker <chris.bar...@noaa.gov> wrote:
>
> For the record, we've got a pretty good thread (not this good, though!) over
> on the numpy list about how to untangle the mess that has resulted from
> porting text-file-parsing code to py3 (and the underlying issue with the 'S'
> data type in numpy...)
>
> One note from the github issue:
> """
> The use of asbytes originates only from the fact that b'%d' % (20,) does
> not work.
> """
>
> So yeah PEP 461! (even if too late for numpy...)
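
For anyone without the github issue to hand, the workaround in that quote amounts to something like the following. This is only a sketch: the asbytes function below is a hypothetical stand-in written for illustration, not numpy's actual helper, and I am assuming ASCII-only output.

```python
def asbytes(value):
    # Hypothetical stand-in for a helper like the one mentioned in the quote:
    # convert to text, then encode to bytes (ASCII assumed).
    return str(value).encode("ascii")


# Format as text first and encode afterwards. This does not rely on
# bytes objects supporting the % operator.
line = asbytes("%d" % (20,)) + b"\n"
print(line)
```

The formatting happens on str and the encoding is a separate, explicit step.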

The discussion about numpy.loadtxt and the 'S' dtype is not relevant to
PEP 461. PEP 461 is about facilitating the handling of ASCII/binary
protocols and file formats. The loadtxt function is for reading text
files. Reading text files is already handled very well in Python 3. The
only caveat is that you need to specify the encoding when you open the
file.

The loadtxt function doesn't specify the encoding when it opens the
file, so on Python 3 it gets the system default encoding when reading
from the file. Since the 'S' dtype is for an array of bytes, the loadtxt
function has to encode the unicode strings before storing them in the
array. The function has no idea what encoding the user wants, so it just
uses latin-1, leading to mojibake if the file's content and encoding are
not compatible with latin-1, e.g. utf-8.

The loadtxt function is a classic example of how *not* to do text, and
whoever made it that way probably didn't understand unicode and the
Python 3 text model. If they did understand what they were doing, then
they knew that they were implementing a dirty hack.

If you want to draw a relevant lesson from that thread in this one, then
the lesson argues against PEP 461: adding back the bytes formatting
methods helps people who refuse to understand text processing and
continue implementing dirty hacks instead of doing it properly.


Oscar
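
P.S. To make the encoding point concrete, here is a minimal sketch of what goes wrong when utf-8 data is treated as latin-1. The sample text and the temporary file are made up for illustration. This is not loadtxt's code, just the same kind of mismatch reproduced with plain open().

```python
import os
import tempfile

text = "café 1.5\n"  # made-up sample row containing a non-ASCII character

# Write the sample as UTF-8, standing in for a data file of known encoding.
with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as f:
    f.write(text.encode("utf-8"))
    path = f.name

try:
    # Specifying the right encoding when opening the file gives the right text.
    with open(path, encoding="utf-8") as f:
        good = f.read()

    # Assuming latin-1 never fails, but silently produces mojibake.
    with open(path, encoding="latin-1") as f:
        bad = f.read()
finally:
    os.remove(path)

print(repr(good))
print(repr(bad))
```

The second read raises no error, because every byte is valid latin-1. That is exactly why this kind of guess goes unnoticed until someone looks at the data.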