On Fri, Oct 21, 2016 at 11:07:46AM +0100, Paul Moore wrote:
> On 21 October 2016 at 10:53, Steven D'Aprano <st...@pearwood.info> wrote:
> > On Wed, Oct 19, 2016 at 12:33:57PM -0700, Nathaniel Smith wrote:
> >
> >> I should also say, regarding your specific example, I guess it's an
> >> open question whether we would want list_iterator.__iterclose__ to
> >> actually do anything. It could flip the iterator to a state where it
> >> always raises StopIteration,
> >
> > That seems like the most obvious.
I've changed my mind -- I think maybe it should do nothing, and
preserve the current behaviour of lists.

I'm now more concerned with keeping the current behaviour as much as
possible than with creating some sort of consistent error condition
for all iterators. Consistency is overrated, and we already have
inconsistency here: file iterators behave differently from list
iterators, because they can be closed:

py> f = open('/proc/mdstat', 'r')
py> a = list(f)
py> b = list(f)
py> len(a), len(b)
(20, 0)
py> f.close()
py> c = list(f)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: I/O operation on closed file.

We don't need to add a close() to list iterators just so they are
consistent with files. Just let __iterclose__ be a no-op.

> So - does this mean "unless you understand what preserve() does,
> you're OK to not use it and your code will continue to work as
> before"? If so, then I'd be happy with this.

Almost. Code like this will behave exactly the same as it currently
does:

    for x in it:
        process(x)
    y = list(it)

If it is a file object, the second call to list() will raise
ValueError; if it is a list_iterator, or generator, etc., y will be an
empty list. That part (I think) shouldn't change.

What *will* change is code that partially processes the iterator in
two different places. A simple example:

py> it = iter([1, 2, 3, 4, 5, 6])
py> for x in it:
...     if x == 4: break
...
py> for x in it:
...     print(x)
...
5
6

This *may* change. With this proposal, the first loop will "close" the
iterator when you exit from the loop. For a list, there's no
finaliser, no __del__ to call, so we can keep the current behaviour
and nobody will notice any difference.

But if `it` is a file iterator instead of a list iterator, the file
will be closed when you exit the first for-loop, and the second loop
will raise ValueError. That will be different.

The fix here is simple: protect the first call from closing:

    for x in itertools.preserve(it):  # preserve, protect, whatever
        ...

Or, if `it` is your own class, give it a __iterclose__ method that
does nothing. (There's a rough sketch of both further down.)

This is a backwards-incompatible change, so I think we would need to
do this:

(1) In Python 3.7, we introduce a __future__ directive:

    from __future__ import iterclose

to enable the new behaviour. (Remember, future directives apply on a
module-by-module basis.)

(2) Without the directive, we keep the old behaviour, except that
warnings are raised if something will change.

(3) Then in 3.8 iterclose becomes the default, the warnings go away,
and the new behaviour just happens.

If that's too fast for people, we could slow it down:

(1) Add the future directive to Python 3.7;

(2) but no warnings by default (you have to opt in to the warnings
with an environment variable, or command-line switch).

(3) Then in 3.8 the warnings are on by default;

(4) And the iterclose behaviour doesn't become standard until 3.9.

That means if this change worries you, you can ignore it until you
migrate to 3.8 (which won't be production-ready until about 2020 or
so), and you don't have to migrate your code until 3.9, which will be
a year or two later. But early adopters can start targeting the new
functionality from 3.7 if they like.

I don't think there's any need for a __future__ directive for
aiterclose, since there's not enough backwards-incompatibility to care
about. (I think, but don't mind if people disagree.) That can happen
starting in 3.7, and when people complain that their synchronous
generators don't have deterministic garbage collection like their
asynchronous ones do, we can point them at the future directive.
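To make the above concrete, here's a very rough sketch of the proposal
as I understand it -- emphatically not the real implementation, just
the shape of it -- showing roughly what a for-loop would do under the
new semantics, plus the two ways of opting out. Treat the helper names
(run_for_loop, MyIterable, and even preserve) as placeholders:

    # CAVEAT: only a sketch of how I read the proposal, not the actual
    # implementation; __iterclose__ and itertools.preserve are still
    # hypothetical.

    def run_for_loop(iterable, body):
        # Roughly what "for x in iterable: body(x)" would do under the
        # proposal: the same as today, plus a finally clause that calls
        # __iterclose__ (if it exists) when the loop exits, whether by
        # exhaustion, break, return or an exception.
        it = iter(iterable)
        try:
            while True:
                try:
                    x = next(it)
                except StopIteration:
                    break
                body(x)
        finally:
            closer = getattr(type(it), '__iterclose__', None)
            if closer is not None:
                closer(it)

    # Opting out, version 1: wrap the iterator so the implicit close is
    # swallowed and the underlying iterator stays usable afterwards.
    # (This is just what I imagine itertools.preserve doing.)
    class preserve:
        def __init__(self, iterator):
            self._iterator = iterator
        def __iter__(self):
            return self
        def __next__(self):
            return next(self._iterator)
        def __iterclose__(self):
            pass    # deliberately do NOT close the underlying iterator

    # Opting out, version 2: if it's your own class, give it a
    # do-nothing __iterclose__, which is all I'm suggesting
    # list_iterator should have.
    class MyIterable:
        def __init__(self, data):
            self._data = list(data)
            self._index = 0
        def __iter__(self):
            return self
        def __next__(self):
            if self._index >= len(self._data):
                raise StopIteration
            value = self._data[self._index]
            self._index += 1
            return value
        def __iterclose__(self):
            pass    # exhausted-but-not-closed, same as today's lists

Under today's interpreter neither __iterclose__ ever gets called, of
course; the point is just that opting out comes down to a one-line
no-op method, which is part of why I'm not too worried about the cost.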
Bottom line is: at first I thought this was a scary change that would
break too much code. But now I think it won't break much, and we can
ease into it really slowly over two or three releases. So I think that
the cost is probably low. I'm still not sure how great the benefit
will be, but I'm leaning towards a +1 on this.


-- 
Steve
_______________________________________________
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/