[issue22128] patch: steer people away from codecs.open

2021-04-16 Thread Irit Katriel


Change by Irit Katriel :


--
stage:  -> resolved
status: pending -> closed

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue22128] patch: steer people away from codecs.open

2021-03-17 Thread Irit Katriel


Irit Katriel  added the comment:

Since Martin corrected the docs in issue 19548 for Python 3, and Python 2 is no 
longer relevant, I believe this can be closed.

--
nosy: +iritkatriel
resolution:  -> duplicate
status: open -> pending
superseder:  -> 'codecs' module docs improvements




[issue22128] patch: steer people away from codecs.open

2015-01-02 Thread Martin Panter

Martin Panter added the comment:

Just pointing out there is a patch for Issue 19548 for Python 3 which also adds 
a pointer to the builtin open() function and updates the codecs.open() caveats. 
That issue doesn’t touch Python 2 though.

--
nosy: +vadmium




[issue22128] patch: steer people away from codecs.open

2014-08-05 Thread Antoine Pitrou

Antoine Pitrou added the comment:

> I don't think it's useful to tell people:
> * use codecs.open() on Python 2.4, 2.5, 2.6
> * use io.open() on Python 2.7 (io is too slow on 2.6 to be a real alternative 
> to codecs.open())
> * use open() on Python 3.4+

Instead we can tell them to use io.open() on all versions from 2.7 and upwards.
2.6 is dead and won't receive any documentation updates anyway.

--
nosy: +pitrou




[issue22128] patch: steer people away from codecs.open

2014-08-05 Thread Frank van Dijk

Changes by Frank van Dijk :


Added file: http://bugs.python.org/file36274/codecsopen3a.patch




[issue22128] patch: steer people away from codecs.open

2014-08-05 Thread Frank van Dijk

Changes by Frank van Dijk :


Added file: http://bugs.python.org/file36273/codecsopen2a.patch




[issue22128] patch: steer people away from codecs.open

2014-08-04 Thread Frank van Dijk

Frank van Dijk added the comment:

> Marc-Andre Lemburg added the comment:
> 
> Pointing people to io.open() as alternative to codecs.open() is a good idea, 
> but that doesn't make codecs.open() less useful.
> 
> The reason why codecs.open() uses binary mode is to avoid issues with 
> automatic newline conversion getting in the way of the file's encoding. Think 
> of e.g. UTF-16 encoded files that use newlines.

Disabling text mode on the underlying file handle does keep a UTF-16 code unit 
like 0x010a from getting mangled, but losing newline conversion is a high 
price to pay. Newline conversion should (conceptually) be done before encoding 
and after decoding. io.open() does it right.
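
A minimal sketch of the difference described above, assuming Python 3 (where 
io.open() is the same function as the builtin open()). The helper name 
read_both_ways and the sample bytes are made up for illustration and are not 
part of any patch on this issue. It reads one CRLF-terminated file through 
codecs.open() and through io.open():

```python
import codecs
import io
import os
import tempfile
import warnings


def read_both_ways(raw_bytes, encoding="utf-8"):
    """Return (text via codecs.open, text via io.open) for the same bytes."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(raw_bytes)
        with warnings.catch_warnings():
            # Newer Python versions may emit a DeprecationWarning here.
            warnings.simplefilter("ignore", DeprecationWarning)
            # codecs.open() opens the file in binary mode: no newline handling.
            with codecs.open(path, "r", encoding=encoding) as f:
                via_codecs = f.read()
        # io.open() decodes first, then applies universal newline translation.
        with io.open(path, "r", encoding=encoding) as f:
            via_io = f.read()
    finally:
        os.remove(path)
    return via_codecs, via_io


via_codecs, via_io = read_both_ways(b"one\r\ntwo\r\n")
print(repr(via_codecs))  # the "\r\n" pairs are still present
print(repr(via_io))      # each "\r\n" has become "\n"
```

The result does not depend on the platform the script runs on, because the 
translation on read is driven by the file contents rather than by os.linesep.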

> 
> Note that codecs allow handling newlines on a line-by-line basis via the 
> .readline() keepends parameter, so issues with Windows vs. Unix can be worked 
> around explicitly. Since default is to keep line ends, no data loss occurs 
> and application code can deal with line ends as it sees fit.

Trouble is, your average Python coder won't do exhaustive research on the pros 
and cons of the various I/O options available, or of dealing with platform 
differences at the application level. They'll just use the open() builtin, then 
realize they need UTF-8 output or whatever, google "python write utf-8" or 
browse the Unicode HOWTO, see a very familiar-looking API and assume it'll 
behave just like open().

> 
> As it stands, I'm -1 on this patch, but would be +1 on mentioning io.open() 
> as alternative to codecs.open() with a slightly different approach to line 
> ends.

What would that mean concretely? Undoing the change to the Unicode HOWTO and 
instead adding a remark along the lines of "The codecs.open() function does not 
have the automatic newline conversion features that the builtin open() function 
provides to make reading and writing text files platform independent. If you 
need automatic newline conversion for the Unicode data you read and write, 
consider using io.open() instead."?

I could live with that.

> 
> I don't think it's useful to tell people:
> * use codecs.open() on Python 2.4, 2.5, 2.6
> * use io.open() on Python 2.7 (io is too slow on 2.6 to be a real alternative 
> to codecs.open())
> * use open() on Python 3.4+

The Unicode HOWTO already recommends open() in all 3.x versions of the 
documentation at docs.python.org.

If you run 2.4 or 2.5 and add new Python software to your ancient system 
without upgrading Python itself, the worst that can happen is a clear-cut 
error when that new software imports io.

I can't judge how much of a problem slowness of the io module is in 2.6 or how 
much 'market share' 2.6 has left, but I'll note that correctness trumps 
performance. I'll also note that we're not changing any code here, nor will 
there be a rush of coders racing to get their existing apps and frameworks in 
line with the new decree.

All we're doing is giving average Python programmers a better chance to 
discover what the drop-in replacement for open() is, or why that helpful tip 
found on the interwebs left them with a subtly mangled text file that looks 
really weird in Notepad and makes git complain.

> 
> codecs.open() works the same across all these Python versions.
> 

--




[issue22128] patch: steer people away from codecs.open

2014-08-03 Thread Marc-Andre Lemburg

Marc-Andre Lemburg added the comment:

Pointing people to io.open() as alternative to codecs.open() is a good idea, 
but that doesn't make codecs.open() less useful.

The reason why codecs.open() uses binary mode is to avoid issues with automatic 
newline conversion getting in the way of the file's encoding. Think of e.g. 
UTF-16 encoded files that use newlines.
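
A small sketch of the UTF-16 concern, using only str/bytes methods. The sample 
character U+010A and the variable names are illustrative assumptions; the 
bytes.replace() call stands in for what byte-level newline translation on an 
already encoded stream would do on write:

```python
text = u"\u010a\n"  # U+010A followed by a real newline

encoded = text.encode("utf-16-le")
# Bytes are 0a 01 0a 00: the first 0x0a is half of U+010A, not a newline.

# Newline translation applied to the encoded bytes hits both 0x0a bytes.
mangled = encoded.replace(b"\n", b"\r\n")
roundtrip_bad = mangled.decode("utf-16-le")

# Newline translation applied to the text before encoding touches only "\n".
converted = text.replace(u"\n", u"\r\n").encode("utf-16-le")
roundtrip_good = converted.decode("utf-16-le")

print(repr(roundtrip_bad))   # no longer contains U+010A
print(repr(roundtrip_good))  # U+010A followed by "\r\n"
```

Translating bytes after encoding shifts every following code unit, which is why 
the underlying file has to stay in binary mode whenever the codec layer sits on 
top of it.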

Note that codecs allow handling newlines on a line-by-line basis via the 
.readline() keepends parameter, so issues with Windows vs. Unix can be worked 
around explicitly. Since default is to keep line ends, no data loss occurs and 
application code can deal with line ends as it sees fit.
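
A brief sketch of the keepends parameter mentioned above. It wraps an in-memory 
byte stream with codecs.getreader() rather than opening a real file; the sample 
data and variable names are assumptions made for the illustration:

```python
import codecs
import io

raw = b"one\r\ntwo\n"  # mixed Windows and Unix line ends

reader = codecs.getreader("utf-8")(io.BytesIO(raw))

# keepends defaults to True, so the line end is returned untouched.
first = reader.readline()

# With keepends=False the reader strips the line end itself.
second = reader.readline(keepends=False)

print(repr(first))
print(repr(second))
```

Either way the decision about what a line end should look like is left to the 
application code, one line at a time.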

As it stands, I'm -1 on this patch, but would be +1 on mentioning io.open() as 
alternative to codecs.open() with a slightly different approach to line ends.

I don't think it's useful to tell people:
* use codecs.open() on Python 2.4, 2.5, 2.6
* use io.open() on Python 2.7 (io is too slow on 2.6 to be a real alternative 
to codecs.open())
* use open() on Python 3.4+

codecs.open() works the same across all these Python versions.

--




[issue22128] patch: steer people away from codecs.open

2014-08-03 Thread STINNER Victor

STINNER Victor added the comment:

See also my PEP 400:
http://legacy.python.org/dev/peps/pep-0400/

--




[issue22128] patch: steer people away from codecs.open

2014-08-03 Thread Serhiy Storchaka

Changes by Serhiy Storchaka :


--
nosy: +doerwalter, haypo, lemburg




[issue22128] patch: steer people away from codecs.open

2014-08-03 Thread Frank van Dijk

Changes by Frank van Dijk :


Added file: http://bugs.python.org/file36235/codecsopen3.patch




[issue22128] patch: steer people away from codecs.open

2014-08-03 Thread Frank van Dijk

New submission from Frank van Dijk:

stackoverflow.com has a zillion answers recommending the use of codecs.open() 
as a Unicode-capable drop-in replacement for open(). This probably means that 
there is still a lot of code being written that uses codecs.open(). That's a 
bad thing because of codecs.open()'s lack of newline conversion. A lot of that 
code will
- have compatibility issues when it is moved between Unix and Windows
- silently break text files on Windows, leading to issues further downstream 
(confusing other tools, messing up revision control histories)

The problem has been fixed with io.open() in 2.x and open() in 3.x. 
Unfortunately the 2.7 Unicode HOWTO still recommends the use of codecs.open(). 
Neither the 2.7 nor the 3.x documentation of codecs.open() refers the reader 
to better alternatives.

The attached patches fix that.

The only downside I see is that newly written code that uses the better 
alternatives would be incompatible with 2.5 and older. However, croaking on a 
small minority of systems is better than silently disrupting workflows, causing 
platform incompatibilities, and inviting flaky workarounds.

The 2.7 patch makes the Unicode HOWTO recommend io.open() instead of 
codecs.open(). Both patches change the codecs.open() documentation to refer to 
io.open() or (on 3.x) open().
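
A hedged sketch of what the recommended io.open() does on write. The helper 
name written_bytes is hypothetical, and the newline argument is passed 
explicitly so that the result is the same on every platform (with the default 
newline=None, "\n" is written as os.linesep instead):

```python
import io
import os
import tempfile


def written_bytes(text, newline, encoding="utf-8"):
    """Write text with io.open() and return the raw bytes that reach the file."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with io.open(path, "w", encoding=encoding, newline=newline) as f:
            f.write(text)
        with io.open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)


# "\n" in the text is translated to the requested line end, then encoded.
crlf = written_bytes(u"one\ntwo\n", newline="\r\n")
lf = written_bytes(u"one\ntwo\n", newline="\n")
utf16 = written_bytes(u"one\n", newline="\r\n", encoding="utf-16-le")

print(repr(crlf))
print(repr(lf))
print(repr(utf16))
```

Because the translation happens on the text before it is encoded, the same call 
also produces well-formed output for encodings such as UTF-16.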

Additionally, I removed the "data loss" explanation from codecs.open()'s note 
about its lack of newline conversion. It is not particularly helpful 
information and it is not entirely correct (data loss could also have been 
avoided by doing newline conversion before encoding and after decoding).

--
assignee: docs@python
components: Documentation
files: codecsopen2.patch
keywords: patch
messages: 224632
nosy: Frank.van.Dijk, docs@python
priority: normal
severity: normal
status: open
title: patch: steer people away from codecs.open
type: behavior
versions: Python 2.7, Python 3.4, Python 3.5
Added file: http://bugs.python.org/file36234/codecsopen2.patch
