This is a very interesting proposal. I just wanted to share something I found in my quick search:
http://stackoverflow.com/questions/14797930/python-custom-iterator-close-a-file-on-stopiteration

Could you explain why the accepted answer there doesn't address this issue?

    class Parse(object):
        """A generator that iterates through a file"""
        def __init__(self, path):
            self.path = path

        def __iter__(self):
            with open(self.path) as f:
                yield from f

Best,
Neil

On Wednesday, October 19, 2016 at 12:39:34 AM UTC-4, Nathaniel Smith wrote:
>
> Hi all,
>
> I'd like to propose that Python's iterator protocol be enhanced to add a first-class notion of completion / cleanup.
>
> This is mostly motivated by thinking about the issues around async generators and cleanup. Unfortunately even though PEP 525 was accepted I found myself unable to stop pondering this, and the more I've pondered the more convinced I've become that the GC hooks added in PEP 525 are really not enough, and that we'll regret it if we stick with them, or at least with them alone :-/. The strategy here is pretty different -- it's an attempt to dig down and make a fundamental improvement to the language that fixes a number of long-standing rough spots, including async generators.
>
> The basic concept is relatively simple: just adding a '__iterclose__' method that 'for' loops call upon completion, even if that's via break or exception. But, the overall issue is fairly complicated + iterators have a large surface area across the language, so the text below is pretty long. Mostly I wrote it all out to convince myself that there wasn't some weird showstopper lurking somewhere :-). For a first pass discussion, it probably makes sense to mainly focus on whether the basic concept makes sense? The main rationale is at the top, but the details are there too for those who want them.
>
> Also, for *right* now I'm hoping -- probably unreasonably -- to try to get the async iterator parts of the proposal in ASAP, ideally for 3.6.0 or 3.6.1. (I know this is about the worst timing for a proposal like this, which I apologize for -- though async generators are provisional in 3.6, so at least in theory changing them is not out of the question.) So again, it might make sense to focus especially on the async parts, which are a pretty small and self-contained part, and treat the rest of the proposal as a longer-term plan provided for context. The comparison to PEP 525 GC hooks comes right after the initial rationale.
>
> Anyway, I'll be interested to hear what you think!
>
> -n
>
> ------------------
>
> Abstract
> ========
>
> We propose to extend the iterator protocol with a new ``__(a)iterclose__`` slot, which is called automatically on exit from ``(async) for`` loops, regardless of how they exit. This allows for convenient, deterministic cleanup of resources held by iterators without reliance on the garbage collector. This is especially valuable for asynchronous generators.
>
>
> Note on timing
> ==============
>
> In practical terms, the proposal here is divided into two separate parts: the handling of async iterators, which should ideally be implemented ASAP, and the handling of regular iterators, which is a larger but more relaxed project that can't start until 3.7 at the earliest. But since the changes are closely related, and we probably don't want to end up with async iterators and regular iterators diverging in the long run, it seems useful to look at them together.
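To make the question above concrete: if I understand the proposal correctly, the accepted answer's ``Parse`` class is exactly the pattern discussed under "Background and motivation" below -- a ``with`` block inside a generator -- so it still leaves the cleanup to the garbage collector whenever the loop exits early. A minimal sketch of the failure mode (``data.txt`` is just a hypothetical input file):

    class Parse(object):
        """A generator-based iterable that reads a file lazily."""
        def __init__(self, path):
            self.path = path

        def __iter__(self):
            with open(self.path) as f:
                yield from f

    def first_line(path):
        for line in Parse(path):
            return line  # leave the loop early; the generator created by
                         # __iter__ is left suspended inside its `with` block

    first_line("data.txt")

On CPython, reference counting usually collects the suspended generator right away, so the ``with`` block happens to fire promptly; on PyPy or Jython the file can stay open for an arbitrarily long time, and in any case the cleanup (and any exception it raises) happens outside the caller's control.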
>
>
> Background and motivation
> =========================
>
> Python iterables often hold resources which require cleanup. For example: ``file`` objects need to be closed; the `WSGI spec <https://www.python.org/dev/peps/pep-0333/>`_ adds a ``close`` method on top of the regular iterator protocol and demands that consumers call it at the appropriate time (though forgetting to do so is a `frequent source of bugs <http://blog.dscpl.com.au/2012/10/obligations-for-calling-close-on.html>`_); and PEP 342 (based on PEP 325) extended generator objects to add a ``close`` method to allow generators to clean up after themselves.
>
> Generally, objects that need to clean up after themselves also define a ``__del__`` method to ensure that this cleanup will happen eventually, when the object is garbage collected. However, relying on the garbage collector for cleanup like this causes serious problems in at least two cases:
>
> - In Python implementations that do not use reference counting (e.g. PyPy, Jython), calls to ``__del__`` may be arbitrarily delayed -- yet many situations require *prompt* cleanup of resources. Delayed cleanup produces problems like crashes due to file descriptor exhaustion, or WSGI timing middleware that collects bogus times.
>
> - Async generators (PEP 525) can only perform cleanup under the supervision of the appropriate coroutine runner. ``__del__`` doesn't have access to the coroutine runner; indeed, the coroutine runner might be garbage collected before the generator object. So relying on the garbage collector is effectively impossible without some kind of language extension. (PEP 525 does provide such an extension, but it has a number of limitations that this proposal fixes; see the "alternatives" section below for discussion.)
>
> Fortunately, Python provides a standard tool for doing resource cleanup in a more structured way: ``with`` blocks. For example, this code opens a file but relies on the garbage collector to close it::
>
>     def read_newline_separated_json(path):
>         for line in open(path):
>             yield json.loads(line)
>
>     for document in read_newline_separated_json(path):
>         ...
>
> and recent versions of CPython will point this out by issuing a ``ResourceWarning``, nudging us to fix it by adding a ``with`` block::
>
>     def read_newline_separated_json(path):
>         with open(path) as file_handle:      # <-- with block
>             for line in file_handle:
>                 yield json.loads(line)
>
>     for document in read_newline_separated_json(path):  # <-- outer for loop
>         ...
>
> But there's a subtlety here, caused by the interaction of ``with`` blocks and generators. ``with`` blocks are Python's main tool for managing cleanup, and they're a powerful one, because they pin the lifetime of a resource to the lifetime of a stack frame. But this assumes that someone will take care of cleaning up the stack frame... and for generators, this requires that someone ``close`` them.
>
> In this case, adding the ``with`` block *is* enough to shut up the ``ResourceWarning``, but this is misleading -- the file object cleanup here is still dependent on the garbage collector. The ``with`` block will only be unwound when the ``read_newline_separated_json`` generator is closed. If the outer ``for`` loop runs to completion then the cleanup will happen immediately; but if this loop is terminated early by a ``break`` or an exception, then the ``with`` block won't fire until the generator object is garbage collected.
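As a concrete illustration of that last point (a sketch, assuming a newline-delimited JSON file ``docs.jsonl``): today, the only way to make that ``with`` block unwind deterministically after an early exit is to close the generator explicitly.

    import json

    def read_newline_separated_json(path):
        with open(path) as file_handle:
            for line in file_handle:
                yield json.loads(line)

    gen = read_newline_separated_json("docs.jsonl")
    first = next(gen)   # generator is now suspended inside its `with` block
    gen.close()         # injects GeneratorExit at the yield; only now does
                        # the `with` block unwind and close the file

A bare ``break`` out of a ``for`` loop never makes that ``close()`` call; it just drops a reference and hopes the garbage collector gets there soon.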
>
> The correct solution requires that all *users* of this API wrap every ``for`` loop in its own ``with`` block::
>
>     with closing(read_newline_separated_json(path)) as genobj:
>         for document in genobj:
>             ...
>
> This gets even worse if we consider the idiom of decomposing a complex pipeline into multiple nested generators::
>
>     def read_users(path):
>         with closing(read_newline_separated_json(path)) as gen:
>             for document in gen:
>                 yield User.from_json(document)
>
>     def users_in_group(path, group):
>         with closing(read_users(path)) as gen:
>             for user in gen:
>                 if user.group == group:
>                     yield user
>
> In general if you have N nested generators then you need N+1 ``with`` blocks to clean up 1 file. And good defensive programming would suggest that any time we use a generator, we should assume the possibility that there could be at least one ``with`` block somewhere in its (potentially transitive) call stack, either now or in the future, and thus always wrap it in a ``with``. But in practice, basically nobody does this, because programmers would rather write buggy code than tiresome repetitive code. In simple cases like this there are some workarounds that good Python developers know (e.g. in this simple case it would be idiomatic to pass in a file handle instead of a path and move the resource management to the top level), but in general we cannot avoid the use of ``with``/``finally`` inside of generators, and thus dealing with this problem one way or another. When beauty and correctness fight then beauty tends to win, so it's important to make correct code beautiful.
>
> Still, is this worth fixing? Until async generators came along I would have argued yes, but that it was a low priority, since everyone seems to be muddling along okay -- but async generators make it much more urgent. Async generators cannot do cleanup *at all* without some mechanism for deterministic cleanup that people will actually use, and async generators are particularly likely to hold resources like file descriptors. (After all, if they weren't doing I/O, they'd be generators, not async generators.) So we have to do something, and it might as well be a comprehensive fix to the underlying problem. And it's much easier to fix this now when async generators are first rolling out, than it will be to fix it later.
>
> The proposal itself is simple in concept: add a ``__(a)iterclose__`` method to the iterator protocol, and have (async) ``for`` loops call it when the loop is exited, even if this occurs via ``break`` or exception unwinding. Effectively, we're taking the current cumbersome idiom (``with`` block + ``for`` loop) and merging them together into a fancier ``for``. This may seem non-orthogonal, but makes sense when you consider that the existence of generators means that ``with`` blocks actually depend on iterator cleanup to work reliably, plus experience showing that iterator cleanup is often a desirable feature in its own right.
>
>
> Alternatives
> ============
>
> PEP 525 asyncgen hooks
> ----------------------
>
> PEP 525 proposes a `set of global thread-local hooks managed by new ``sys.{get/set}_asyncgen_hooks()`` functions <https://www.python.org/dev/peps/pep-0525/#finalization>`_, which allow event loops to integrate with the garbage collector to run cleanup for async generators.
> In principle, this proposal and PEP 525 are complementary, in the same way that ``with`` blocks and ``__del__`` are complementary: this proposal takes care of ensuring deterministic cleanup in most cases, while PEP 525's GC hooks clean up anything that gets missed. But ``__aiterclose__`` provides a number of advantages over GC hooks alone:
>
> - The GC hook semantics aren't part of the abstract async iterator protocol, but are instead restricted `specifically to the async generator concrete type <XX find and link Yury's email saying this>`_. If you have an async iterator implemented using a class, like::
>
>       class MyAsyncIterator:
>           async def __anext__(self):
>               ...
>
>   then you can't refactor this into an async generator without changing its semantics, and vice-versa. This seems very unpythonic. (It also leaves open the question of what exactly class-based async iterators are supposed to do, given that they face exactly the same cleanup problems as async generators.) ``__aiterclose__``, on the other hand, is defined at the protocol level, so it's duck-type friendly and works for all iterators, not just generators.
>
> - Code that wants to work on non-CPython implementations like PyPy cannot in general rely on GC for cleanup. Without ``__aiterclose__``, it's more or less guaranteed that developers who develop and test on CPython will produce libraries that leak resources when used on PyPy. Developers who do want to target alternative implementations will either have to take the defensive approach of wrapping every ``for`` loop in a ``with`` block, or else carefully audit their code to figure out which generators might possibly contain cleanup code and add ``with`` blocks around those only. With ``__aiterclose__``, writing portable code becomes easy and natural.
>
> - An important part of building robust software is making sure that exceptions always propagate correctly without being lost. One of the most exciting things about async/await compared to traditional callback-based systems is that instead of requiring manual chaining, the runtime can now do the heavy lifting of propagating errors, making it *much* easier to write robust code. But, this beautiful new picture has one major gap: if we rely on the GC for generator cleanup, then exceptions raised during cleanup are lost. So, again, without ``__aiterclose__``, developers who care about this kind of robustness will either have to take the defensive approach of wrapping every ``for`` loop in a ``with`` block, or else carefully audit their code to figure out which generators might possibly contain cleanup code. ``__aiterclose__`` plugs this hole by performing cleanup in the caller's context, so writing more robust code becomes the path of least resistance.
>
> - The WSGI experience suggests that there exist important iterator-based APIs that need prompt cleanup and cannot rely on the GC, even in CPython. For example, consider a hypothetical WSGI-like API based around async/await and async iterators, where a response handler is an async generator that takes request headers + an async iterator over the request body, and yields response headers + the response body. (This is actually the use case that got me interested in async generators in the first place, i.e. this isn't hypothetical.)
>   If we follow WSGI in requiring that child iterators must be closed properly, then without ``__aiterclose__`` the absolute most minimalistic middleware in our system looks something like::
>
>       async def noop_middleware(handler, request_header, request_body):
>           async with aclosing(handler(request_header, request_body)) as aiter:
>               async for response_item in aiter:
>                   yield response_item
>
>   Arguably in regular code one can get away with skipping the ``with`` block around ``for`` loops, depending on how confident one is that one understands the internal implementation of the generator. But here we have to cope with arbitrary response handlers, so without ``__aiterclose__``, this ``with`` construction is a mandatory part of every middleware.
>
>   ``__aiterclose__`` allows us to eliminate the mandatory boilerplate and an extra level of indentation from every middleware::
>
>       async def noop_middleware(handler, request_header, request_body):
>           async for response_item in handler(request_header, request_body):
>               yield response_item
>
> So the ``__aiterclose__`` approach provides substantial advantages over GC hooks.
>
> This leaves open the question of whether we want a combination of GC hooks + ``__aiterclose__``, or just ``__aiterclose__`` alone. Since the vast majority of generators are iterated over using a ``for`` loop or equivalent, ``__aiterclose__`` handles most situations before the GC has a chance to get involved. The case where GC hooks provide additional value is in code that does manual iteration, e.g.::
>
>     agen = fetch_newline_separated_json_from_url(...)
>     while True:
>         document = await type(agen).__anext__(agen)
>         if document["id"] == needle:
>             break
>     # doesn't do 'await agen.aclose()'
>
> If we go with the GC-hooks + ``__aiterclose__`` approach, this generator will eventually be cleaned up by GC calling the generator ``__del__`` method, which then will use the hooks to call back into the event loop to run the cleanup code.
>
> If we go with the no-GC-hooks approach, this generator will eventually be garbage collected, with the following effects:
>
> - its ``__del__`` method will issue a warning that the generator was not closed (similar to the existing "coroutine never awaited" warning).
>
> - The underlying resources involved will still be cleaned up, because the generator frame will still be garbage collected, causing it to drop references to any file handles or sockets it holds, and then those objects' ``__del__`` methods will release the actual operating system resources.
>
> - But, any cleanup code inside the generator itself (e.g. logging, buffer flushing) will not get a chance to run.
>
> The solution here -- as the warning would indicate -- is to fix the code so that it calls ``__aiterclose__``, e.g. by using a ``with`` block::
>
>     async with aclosing(fetch_newline_separated_json_from_url(...)) as agen:
>         while True:
>             document = await type(agen).__anext__(agen)
>             if document["id"] == needle:
>                 break
>
> Basically in this approach, the rule would be that if you want to manually implement the iterator protocol, then it's your responsibility to implement all of it, and that now includes ``__(a)iterclose__``.
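For what it's worth, here is what "implementing all of it" might look like for a class-based async iterator under this proposal. This is only a sketch: ``open_connection()`` and the connection object's ``readline()``/``close()`` methods are hypothetical stand-ins for whatever I/O library is actually in use.

    import json

    class JsonLinesFromUrl:
        """Hypothetical async iterator that owns a network connection."""

        def __init__(self, url):
            self._url = url
            self._conn = None

        def __aiter__(self):
            return self

        async def __anext__(self):
            if self._conn is None:
                self._conn = await open_connection(self._url)  # hypothetical helper
            line = await self._conn.readline()
            if not line:
                raise StopAsyncIteration
            return json.loads(line)

        async def __aiterclose__(self):
            # Idempotent cleanup: safe to call more than once, and the
            # iterator is not expected to be usable afterwards.
            if self._conn is not None:
                conn, self._conn = self._conn, None
                await conn.close()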
>
> GC hooks add non-trivial complexity in the form of (a) new global interpreter state, (b) a somewhat complicated control flow (e.g., async generator GC always involves resurrection, so the details of PEP 442 are important), and (c) a new public API in asyncio (``await loop.shutdown_asyncgens()``) that users have to remember to call at the appropriate time. (This last point in particular somewhat undermines the argument that GC hooks provide a safe backup to guarantee cleanup, since if ``shutdown_asyncgens()`` isn't called correctly then I *think* it's possible for generators to be silently discarded without their cleanup code being called; compare this to the ``__aiterclose__``-only approach where in the worst case we still at least get a warning printed. This might be fixable.) All this considered, GC hooks arguably aren't worth it, given that the only people they help are those who want to manually call ``__anext__`` yet don't want to manually call ``__aiterclose__``. But Yury disagrees with me on this :-). And both options are viable.
>
>
> Always inject resources, and do all cleanup at the top level
> ------------------------------------------------------------
>
> It was suggested on python-dev (XX find link) that a pattern to avoid these problems is to always pass resources in from above, e.g. ``read_newline_separated_json`` should take a file object rather than a path, with cleanup handled at the top level::
>
>     def read_newline_separated_json(file_handle):
>         for line in file_handle:
>             yield json.loads(line)
>
>     def read_users(file_handle):
>         for document in read_newline_separated_json(file_handle):
>             yield User.from_json(document)
>
>     with open(path) as file_handle:
>         for user in read_users(file_handle):
>             ...
>
> This works well in simple cases; here it lets us avoid the "N+1 ``with`` blocks problem". But unfortunately, it breaks down quickly when things get more complex. Consider if instead of reading from a file, our generator was reading from a streaming HTTP GET request -- while handling redirects and authentication via OAUTH. Then we'd really want the sockets to be managed down inside our HTTP client library, not at the top level. Plus there are other cases where ``finally`` blocks embedded inside generators are important in their own right: db transaction management, emitting logging information during cleanup (one of the major motivating use cases for WSGI ``close``), and so forth. So this is really a workaround for simple cases, not a general solution.
>
>
> More complex variants of __(a)iterclose__
> -----------------------------------------
>
> The semantics of ``__(a)iterclose__`` are somewhat inspired by ``with`` blocks, but context managers are more powerful: ``__(a)exit__`` can distinguish between a normal exit versus exception unwinding, and in the case of an exception it can examine the exception details and optionally suppress propagation. ``__(a)iterclose__`` as proposed here does not have these powers, but one can imagine an alternative design where it did.
>
> However, this seems like unwarranted complexity: experience suggests that it's common for iterables to have ``close`` methods, and even to have ``__exit__`` methods that call ``self.close()``, but I'm not aware of any common cases that make use of ``__exit__``'s full power. I also can't think of any examples where this would be useful.
> And it seems unnecessarily confusing to allow iterators to affect flow control by swallowing exceptions -- if you're in a situation where you really want that, then you should probably use a real ``with`` block anyway.
>
>
> Specification
> =============
>
> This section describes where we want to eventually end up, though there are some backwards compatibility issues that mean we can't jump directly here. A later section describes the transition plan.
>
>
> Guiding principles
> ------------------
>
> Generally, ``__(a)iterclose__`` implementations should:
>
> - be idempotent,
> - perform any cleanup that is appropriate on the assumption that the iterator will not be used again after ``__(a)iterclose__`` is called. In particular, once ``__(a)iterclose__`` has been called then calling ``__(a)next__`` produces undefined behavior.
>
> And generally, any code which starts iterating through an iterable with the intention of exhausting it, should arrange to make sure that ``__(a)iterclose__`` is eventually called, whether or not the iterator is actually exhausted.
>
>
> Changes to iteration
> --------------------
>
> The core proposal is the change in behavior of ``for`` loops. Given this Python code::
>
>     for VAR in ITERABLE:
>         LOOP-BODY
>     else:
>         ELSE-BODY
>
> we desugar to the equivalent of::
>
>     _iter = iter(ITERABLE)
>     _iterclose = getattr(type(_iter), "__iterclose__", lambda obj: None)
>     try:
>         traditional-for VAR in _iter:
>             LOOP-BODY
>         else:
>             ELSE-BODY
>     finally:
>         _iterclose(_iter)
>
> where the "traditional-for statement" here is meant as a shorthand for the classic 3.5-and-earlier ``for`` loop semantics.
>
> Besides the top-level ``for`` statement, Python also contains several other places where iterators are consumed. For consistency, these should call ``__iterclose__`` as well using semantics equivalent to the above. This includes:
>
> - ``for`` loops inside comprehensions
> - ``*`` unpacking
> - functions which accept and fully consume iterables, like ``list(it)``, ``tuple(it)``, ``itertools.product(it1, it2, ...)``, and others.
>
>
> Changes to async iteration
> --------------------------
>
> We also make the analogous changes to async iteration constructs, except that the new slot is called ``__aiterclose__``, and it's an async method that gets ``await``\ed.
>
>
> Modifications to basic iterator types
> -------------------------------------
>
> Generator objects (including those created by generator comprehensions):
>
> - ``__iterclose__`` calls ``self.close()``
> - ``__del__`` calls ``self.close()`` (same as now), and additionally issues a ``ResourceWarning`` if the generator wasn't exhausted. This warning is hidden by default, but can be enabled for those who want to make sure they aren't inadvertently relying on CPython-specific GC semantics.
>
> Async generator objects (including those created by async generator comprehensions):
>
> - ``__aiterclose__`` calls ``self.aclose()``
> - ``__del__`` issues a ``RuntimeWarning`` if ``aclose`` has not been called, since this probably indicates a latent bug, similar to the "coroutine never awaited" warning.
>
> QUESTION: should file objects implement ``__iterclose__`` to close the file? On the one hand this would make this change more disruptive; on the other hand people really like writing ``for line in open(...): ...``, and if we get used to iterators taking care of their own cleanup then it might become very weird if files don't.
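To restate the desugaring in terms of code that runs today (a sketch): for a generator, the implicit ``__iterclose__`` call that the proposal adds to ``for`` loops is essentially what we currently have to spell out with ``contextlib.closing``.

    import json
    from contextlib import closing

    def read_newline_separated_json(path):
        with open(path) as file_handle:
            for line in file_handle:
                yield json.loads(line)

    def first_matching(path, needle):
        # Today's manual spelling of what the proposal would make automatic:
        # the generator is closed even though we leave the loop via `return`,
        # so its `with` block unwinds (and the file closes) immediately.
        with closing(read_newline_separated_json(path)) as gen:
            for document in gen:
                if document["id"] == needle:
                    return document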
>
>
> New convenience functions
> -------------------------
>
> The ``itertools`` module gains a new iterator wrapper that can be used to selectively disable the new ``__iterclose__`` behavior::
>
>     # QUESTION: I feel like there might be a better name for this one?
>     class preserve:
>         def __init__(self, iterable):
>             self._it = iter(iterable)
>
>         def __iter__(self):
>             return self
>
>         def __next__(self):
>             return next(self._it)
>
>         def __iterclose__(self):
>             # Swallow __iterclose__ without passing it on
>             pass
>
> Example usage (assuming that file objects implement ``__iterclose__``)::
>
>     with open(...) as handle:
>         # Iterate through the same file twice:
>         for line in itertools.preserve(handle):
>             ...
>         handle.seek(0)
>         for line in itertools.preserve(handle):
>             ...
>
> The ``operator`` module gains two new functions, with semantics equivalent to the following::
>
>     def iterclose(it):
>         if hasattr(type(it), "__iterclose__"):
>             type(it).__iterclose__(it)
>
>     async def aiterclose(ait):
>         if hasattr(type(ait), "__aiterclose__"):
>             await type(ait).__aiterclose__(ait)
>
> These are particularly useful when implementing the changes in the next section:
>
>
> __iterclose__ implementations for iterator wrappers
> ---------------------------------------------------
>
> Python ships a number of iterator types that act as wrappers around other iterators: ``map``, ``zip``, ``itertools.accumulate``, ``csv.reader``, and others. These iterators should define a ``__iterclose__`` method which calls ``__iterclose__`` in turn on their underlying iterators. For example, ``map`` could be implemented as::
>
>     class map:
>         def __init__(self, fn, *iterables):
>             self._fn = fn
>             self._iters = [iter(iterable) for iterable in iterables]
>
>         def __iter__(self):
>             return self
>
>         def __next__(self):
>             return self._fn(*[next(it) for it in self._iters])
>
>         def __iterclose__(self):
>             for it in self._iters:
>                 operator.iterclose(it)
>
> In some cases this requires some subtlety; for example, `itertools.tee <https://docs.python.org/3/library/itertools.html#itertools.tee>`_ should not call ``__iterclose__`` on the underlying iterator until it has been called on *all* of the clone iterators.
>
>
> Example / Rationale
> -------------------
>
> The payoff for all this is that we can now write straightforward code like::
>
>     def read_newline_separated_json(path):
>         for line in open(path):
>             yield json.loads(line)
>
> and be confident that the file will receive deterministic cleanup *without the end-user having to take any special effort*, even in complex cases. For example, consider this silly pipeline::
>
>     list(map(lambda key: key.upper(),
>              (doc["key"] for doc in read_newline_separated_json(path))))
>
> If our file contains a document where ``doc["key"]`` turns out to be an integer, then the following sequence of events will happen:
>
> 1. ``key.upper()`` raises an ``AttributeError``, which propagates out of the ``map`` and triggers the implicit ``finally`` block inside ``list``.
> 2. The ``finally`` block in ``list`` calls ``__iterclose__()`` on the map object.
> 3. ``map.__iterclose__()`` calls ``__iterclose__()`` on the generator comprehension object.
> 4. This injects a ``GeneratorExit`` exception into the generator comprehension body, which is currently suspended inside the comprehension's ``for`` loop body.
> 5. The exception propagates out of the ``for`` loop, triggering the ``for`` loop's implicit ``finally`` block, which calls ``__iterclose__`` on the generator object representing the call to ``read_newline_separated_json``.
> 6. This injects an inner ``GeneratorExit`` exception into the body of ``read_newline_separated_json``, currently suspended at the ``yield``.
> 7. The inner ``GeneratorExit`` propagates out of the ``for`` loop, triggering the ``for`` loop's implicit ``finally`` block, which calls ``__iterclose__()`` on the file object.
> 8. The file object is closed.
> 9. The inner ``GeneratorExit`` resumes propagating, hits the boundary of the generator function, and causes ``read_newline_separated_json``'s ``__iterclose__()`` method to return successfully.
> 10. Control returns to the generator comprehension body, and the outer ``GeneratorExit`` continues propagating, allowing the comprehension's ``__iterclose__()`` to return successfully.
> 11. The rest of the ``__iterclose__()`` calls unwind without incident, back into the body of ``list``.
> 12. The original ``AttributeError`` resumes propagating.
>
> (The details above assume that we implement ``file.__iterclose__``; if not then add a ``with`` block to ``read_newline_separated_json`` and essentially the same logic goes through.)
>
> Of course, from the user's point of view, this can be simplified down to just:
>
> 1. ``int.upper()`` raises an ``AttributeError``
> 2. The file object is closed.
> 3. The ``AttributeError`` propagates out of ``list``
>
> So we've accomplished our goal of making this "just work" without the user having to think about it.
>
>
> Transition plan
> ===============
>
> While the majority of existing ``for`` loops will continue to produce identical results, the proposed changes will produce backwards-incompatible behavior in some cases. Example::
>
>     def read_csv_with_header(lines_iterable):
>         lines_iterator = iter(lines_iterable)
>         for line in lines_iterator:
>             column_names = line.strip().split("\t")
>             break
>         for line in lines_iterator:
>             values = line.strip().split("\t")
>             record = dict(zip(column_names, values))
>             yield record
>
> This code used to be correct, but after this proposal is implemented will require an ``itertools.preserve`` call added to the first ``for`` loop.
>
> [QUESTION: currently, if you close a generator and then try to iterate over it then it just raises ``Stop(Async)Iteration``, so code that passes the same generator object to multiple ``for`` loops but forgets to use ``itertools.preserve`` won't see an obvious error -- the second ``for`` loop will just exit immediately. Perhaps it would be better if iterating a closed generator raised a ``RuntimeError``? Note that files don't have this problem -- attempting to iterate a closed file object already raises ``ValueError``.]
>
> Specifically, the incompatibility happens when all of these factors come together:
>
> - The automatic calling of ``__(a)iterclose__`` is enabled
> - The iterable did not previously define ``__(a)iterclose__``
> - The iterable does now define ``__(a)iterclose__``
> - The iterable is re-used after the ``for`` loop exits
>
> So the problem is how to manage this transition, and those are the levers we have to work with.
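For concreteness, here's roughly how the ``read_csv_with_header`` example above might be updated once the proposal lands. The ``preserve`` wrapper below is just a local sketch of the proposed ``itertools.preserve``, written so that it also runs harmlessly on today's Python (where ``__iterclose__`` is simply never called):

    class preserve:
        """Sketch of the proposed itertools.preserve wrapper."""

        def __init__(self, iterable):
            self._it = iter(iterable)

        def __iter__(self):
            return self

        def __next__(self):
            return next(self._it)

        def __iterclose__(self):
            # Swallow the close so the underlying iterator stays usable.
            pass

    def read_csv_with_header(lines_iterable):
        lines_iterator = iter(lines_iterable)
        for line in preserve(lines_iterator):   # don't let `break` close it
            column_names = line.strip().split("\t")
            break
        for line in lines_iterator:             # this loop is allowed to close it
            values = line.strip().split("\t")
            yield dict(zip(column_names, values))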
>
> First, observe that the only async iterables where we propose to add ``__aiterclose__`` are async generators, and there is currently no existing code using async generators (though this will start changing very soon), so the async changes do not produce any backwards incompatibilities. (There is existing code using async iterators, but using the new async for loop on an old async iterator is harmless, because old async iterators don't have ``__aiterclose__``.) In addition, PEP 525 was accepted on a provisional basis, and async generators are by far the biggest beneficiary of this PEP's proposed changes. Therefore, I think we should strongly consider enabling ``__aiterclose__`` for ``async for`` loops and async generators ASAP, ideally for 3.6.0 or 3.6.1.
>
> For the non-async world, things are harder, but here's a potential transition path:
>
> In 3.7:
>
> Our goal is that existing unsafe code will start emitting warnings, while those who want to opt-in to the future can do that immediately:
>
> - We immediately add all the ``__iterclose__`` methods described above.
> - If ``from __future__ import iterclose`` is in effect, then ``for`` loops and ``*`` unpacking call ``__iterclose__`` as specified above.
> - If the future is *not* enabled, then ``for`` loops and ``*`` unpacking do *not* call ``__iterclose__``. But they do call some other method instead, e.g. ``__iterclose_warning__``.
> - Similarly, functions like ``list`` use stack introspection (!!) to check whether their direct caller has ``__future__.iterclose`` enabled, and use this to decide whether to call ``__iterclose__`` or ``__iterclose_warning__``.
> - For all the wrapper iterators, we also add ``__iterclose_warning__`` methods that forward to the ``__iterclose_warning__`` method of the underlying iterator or iterators.
> - For generators (and files, if we decide to do that), ``__iterclose_warning__`` is defined to set an internal flag, and other methods on the object are modified to check for this flag. If they find the flag set, they issue a ``PendingDeprecationWarning`` to inform the user that in the future this sequence would have led to a use-after-close situation and the user should use ``preserve()``.
>
> In 3.8:
>
> - Switch from ``PendingDeprecationWarning`` to ``DeprecationWarning``
>
> In 3.9:
>
> - Enable the ``__future__`` unconditionally and remove all the ``__iterclose_warning__`` stuff.
>
> I believe that this satisfies the normal requirements for this kind of transition -- opt-in initially, with warnings targeted precisely to the cases that will be affected, and a long deprecation cycle.
>
> Probably the most controversial / risky part of this is the use of stack introspection to make the iterable-consuming functions sensitive to a ``__future__`` setting, though I haven't thought of any situation where it would actually go wrong yet...
>
>
> Acknowledgements
> ================
>
> Thanks to Yury Selivanov, Armin Rigo, and Carl Friedrich Bolz for helpful discussion on earlier versions of this idea.
>
> --
> Nathaniel J. Smith -- https://vorpus.org
>
> _______________________________________________
> Python-ideas mailing list
> python...@python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
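(For what it's worth, I read the stack-introspection idea in the transition plan as something roughly like the sketch below. The ``CO_FUTURE_ITERCLOSE`` flag is purely hypothetical -- named here only for illustration -- but ``__future__`` imports do already work by setting compiler flags on the calling code object.)

    import sys

    CO_FUTURE_ITERCLOSE = 0x1000000  # hypothetical flag for the proposed future import

    def _caller_wants_iterclose(depth=2):
        """Sketch: how list() and friends might check their direct caller."""
        caller = sys._getframe(depth)
        return bool(caller.f_code.co_flags & CO_FUTURE_ITERCLOSE)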
_______________________________________________
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/