On 18 July 2016 at 23:40, Barry Warsaw <ba...@python.org> wrote: > I was trying to debug a problem in some work code and I ran into some > interesting oddities with contextlib.ExitStack and other context managers in > Python 3.5. > > This program creates a temporary directory, and I wanted to give it a --keep > flag to not automatically delete the tempdir at program exit. I was using an > ExitStack to manage a bunch of resources, and the temporary directory is the > first thing pushed into the ExitStack. At that point in the program, I check > the value of --keep and if it's set, I use ExitStack.pop_all() to clear the > stack, and thus, presumably, prevent the temporary directory from being > deleted. > > There's this relevant quote in the contextlib documentation: > > """ > Each instance [of an ExitStack] maintains a stack of registered callbacks that > are called in reverse order when the instance is closed (either explicitly or > implicitly at the end of a with statement). Note that callbacks are not > invoked implicitly when the context stack instance is garbage collected. > """ > > However if I didn't save the reference to the pop_all'd ExitStack, the tempdir > would be immediately deleted. If I did save a reference to the pop_all'd > ExitStack, the tempdir would live until the saved reference went out of scope > and got refcounted away. > > As best I can tell this happens because TemporaryDirectory.__init__() creates > a weakref finalizer which ends up calling the _cleanup() function. Although > it's rather difficult to trace, it does appear that when the ExitStack is > gc'd, this finalizer gets triggered (via weakref.detach()), thus causing the > _cleanup() method to be called and the tmpdir to get deleted.
This seems to be missing from the documentation, but when you delete a TemporaryDirectory instance without using a context manager or cleanup(), it complains, and then cleans up anyway: >>> d = TemporaryDirectory() >>> del d /usr/lib/python3.5/tempfile.py:797: ResourceWarning: Implicitly cleaning up <TemporaryDirectory '/tmp/tmpfzf4g96l'> _warnings.warn(warn_message, ResourceWarning) Perhaps you will have to use the lower-level mkdtemp() function instead if you want the option of making the “temporary” directory long-lived. > I "fix" this by > doing: > > def __init__(self): > tmpdir = TemporaryDirectory() > self._tmpdir = (tmpdir.name if keep > else self.resources.enter_context(tmpdir)) > > There must be more to the story because when __init__() exits in the --keep > case, tmpdir should have gotten refcounted away and the directory deleted, but > it doesn't. I haven't dug down deep enough to figure that out. > > Now, while I was debugging that behavior, I ran across more interesting bits. > I put this in a file to drive some tests: > > ------snip snip----- > with ExitStack() as resources: > print('enter context') > tmpdir = resources.enter_context(X()) > resources.pop_all() > print('exit context') > ------snip snip----- > > Let's say X is: > > class X: > def __enter__(self): > print('enter Foo') > return self > > def __exit__(self, *args, **kws): > print('exit Foo') > return False > > the output is: > > enter context > enter Foo > exit context > > So far so good. A fairly standard context manager class doesn't get its > __exit__() called even when the program exits. Let's try this: > > @contextmanager > def X(): > print('enter bar') > yield > print('exit bar') > > still good: > > enter context > enter bar > exit context > > Let's modify X a little bit to be a more common idiom: > > @contextmanager > def X(): > print('enter foo') > try: > yield > finally: > print('exit foo') > > enter context > enter foo > exit foo > exit context > > Ah, the try-finally changes the behavior! There's probably some documentation > somewhere that defines how a generator gets finalized, and that triggers the > finally clause, whereas in the previous example, nothing after the yield gets > run. I just can't find that anything that would describe the observed > behavior. I suspect the documentation doesn’t spell everything out, but my understanding is that garbage collection of a generator instance effectively calls its close() method, triggering any “finally” and __exit__() handlers. IMO, in some cases if a generator would execute these handlers and gets garbage collected, it is a programming error because the programmer should have explicitly called generator.close(). In these cases, it would be nice to emit a ResourceWarning, just like forgetting to close a file, or delete your temporay directory above. But maybe there are other cases where there is no valid reason to emit a warning. I have hesitated in suggesting this change in the past, but I don’t remember why. One reason is it might be an annoyance with code that also wants to handle non-generator iterators that don’t have a close() method. In a world before “yield from”, imagine having to change: # Rough equivalent of “yield from g(...)” for item in g(...): yield item to: with contextlib.closing(g(...)) as subgen: for item in subgen: yield item It would be even worse if you had to support g() returning some other iterator not supporting close(). > It's all very twisty, and I'm not sure Python is doing anything wrong, but I'm > also not sure it's *not* doing anything wrong. ;) > > In any case, the contextlib documentation quoted above should probably be more > liberally sprinkled with salty caveats. Just calling .pop_all() isn't > necessarily enough to ensure that resources managed by an ExitStack will > survive its garbage collection. _______________________________________________ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com