On 12/7/2017 4:48 PM, Victor Stinner wrote:

Ok, now comes the real question, open().

For open(), I used the example of a code snippet *writing* the content
of a directory (os.listdir) into a text file. Another example is to
read filenames from a text file but pass through undecodable bytes
thanks to surrogateescape.
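
To make the surrogateescape pass-through concrete, here is a minimal sketch. The filename bytes, the temporary file name, and the variable names are all made up for illustration, and the explicit decode only imitates what os.listdir() would hand back for an undecodable name on a POSIX system with a UTF-8 locale.

```python
import os
import tempfile

# Hypothetical filename: 0xff can never occur in valid UTF-8.
raw_name = b"caf\xff.txt"

# Undecodable bytes become lone surrogates (here U+DCFF) instead of raising.
name = raw_name.decode("utf-8", errors="surrogateescape")

with tempfile.TemporaryDirectory() as tmp:
    listing = os.path.join(tmp, "listing.txt")

    # Writing with surrogateescape turns the surrogate back into the raw byte.
    # newline="\n" keeps the bytes on disk identical on every platform.
    with open(listing, "w", encoding="utf-8",
              errors="surrogateescape", newline="\n") as f:
        f.write(name + "\n")

    with open(listing, "rb") as f:
        on_disk = f.read()

    # Reading it back with the same error handler restores the same str.
    with open(listing, "r", encoding="utf-8", errors="surrogateescape") as f:
        read_back = f.read().rstrip("\n")

    # With the strict handler, the same write is rejected.
    strict_failed = False
    try:
        with open(listing, "w", encoding="utf-8") as f:
            f.write(name)
    except UnicodeEncodeError:
        strict_failed = True
```

The point of the round trip is that no information is lost: the bytes written to the file are exactly the bytes of the original name.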

But Naoki explained that open() is commonly misused to open binary
files and Python should somehow fail badly to notify the developer of
their mistake.
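
A small sketch of the misuse described here, assuming an invented binary payload and a helper named read_as_text that exists only for this example: under the strict handler the mistake surfaces as an exception, while under surrogateescape the same read succeeds quietly.

```python
import os
import tempfile

# Invented binary payload; 0x89 is not a valid UTF-8 start byte.
payload = bytes([0x89, 0x50, 0x4E, 0x47, 0xFF, 0xFE, 0x00, 0x01])


def read_as_text(path, errors):
    """Open a file in text mode as UTF-8 with the given error handler."""
    with open(path, "r", encoding="utf-8", errors=errors) as f:
        return f.read()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "blob.bin")
    with open(path, "wb") as f:
        f.write(payload)

    # Strict decoding: the developer is told about the mistake.
    try:
        read_as_text(path, "strict")
        strict_outcome = "decoded"
    except UnicodeDecodeError:
        strict_outcome = "UnicodeDecodeError"

    # surrogateescape: the read succeeds and the mistake goes unnoticed.
    lenient_text = read_as_text(path, "surrogateescape")
```

In the second case the data is not corrupted (it can be encoded back to the same bytes), but nothing tells the developer that a binary file was opened as text.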

So the real problem here is that open() has a default mode of text. Rather than forcing the caller to specify either "text" or "binary" when opening, it uses text as the default and treats binary as an option to be specified.

I understand that default has a long history in Unix-land, dating at least as far back as 1977 when I first learned how to use the Unix open() function.

And now it would be an incompatible change to change it.

The real question is whether or not it is a good idea to change it. At this point in time, with Unicode and UTF-8 so prevalent, text and binary modes differ far more than they did back in 1977. Back then, binary mode mostly just documented that a binary file was being opened, and that one could more likely expect to see read() than fgets() in the following code.

If it were to be changed, one could add a text-mode option in 3.7, say "t" in the mode string, and a PendingDeprecationWarning for open calls without the specification of either t or b in the mode string.

In 3.8, the warning would be changed to DeprecationWarning.

In 3.9, all open calls would need to have either t or b, or would fail.
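
The first stage of that schedule could be prototyped outside the interpreter with a wrapper along these lines. This is only a sketch: the name open_with_explicit_mode is hypothetical, and nothing like it exists in the standard library. Note that Python 3's open() already accepts "t" in the mode string (e.g. "rt"), so the wrapper only has to detect its absence.

```python
import builtins
import warnings


def open_with_explicit_mode(file, mode="r", *args, **kwargs):
    """Hypothetical wrapper: warn when mode names neither 't' nor 'b'."""
    if "t" not in mode and "b" not in mode:
        warnings.warn(
            "open() mode %r specifies neither 't' nor 'b'" % (mode,),
            PendingDeprecationWarning,
            stacklevel=2,  # attribute the warning to the caller
        )
    # Behavior is otherwise unchanged: delegate to the real open().
    return builtins.open(file, mode, *args, **kwargs)
```

Switching the category to DeprecationWarning, and later raising ValueError instead of warning, would correspond to the 3.8 and 3.9 stages.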

Meanwhile, back on the PEP 540 ranch, text mode open calls could immediately use surrogateescape, binary mode open calls would not, and unspecified open calls would not.