On Mon, 18 Nov 2019 at 08:42, Paul Moore <p.f.mo...@gmail.com> wrote:
>
> On Sun, 17 Nov 2019 at 19:18, Oscar Benjamin <oscar.j.benja...@gmail.com> wrote:
> >
> > Ultimately the problem is that the requirements on a context manager
> > are not clearly spelled out. The with statement gives context manager
> > authors a strong guarantee that if __enter__ returns successfully then
> > __exit__ will be called at some point later. There needs to be a
> > reverse requirement on context manager authors to guarantee that it is
> > not necessary to call __exit__ whenever __enter__ has not been called.
> > With the protocol requirements specified in both directions it would
> > be easy to make utilities like nested for combining context managers
> > in different ways.
>
> The context here has been lost - I've searched the thread and I can't
> find a proper explanation of how open() "misbehaves" in any way that
> seems to relate to this statement (I don't actually see any real
> explanation of any problem with open() to be honest). There's some
> stuff about what happens if open() itself fails, but I don't see how
> that results in a problem (as opposed to something like a subtle
> application error because the writer didn't realise this could
> happen).
>
> Can someone restate the problem please?

Sorry Paul! I think a small number of us were following a sub-thread
here where we understood what we were talking about but it wasn't
clearly spelt out anywhere. I introduced the word "misbehave" so I'll
clarify what I meant. First I'll describe all of the background:

Python 2.5 introduced the with statement from PEP 343 and made file
objects into context managers by adding __enter__ and __exit__
methods. This means that the object returned by open can be used in a
with statement like

with open(filename) as fin:
    ...

The contextlib module was also added in Python 2.5 and included a
useful utility called nested:
https://docs.python.org/2.7/library/contextlib.html#contextlib.nested

The idea with nested is that you could flatten nested with statements so

with mgr1:
    with mgr2:
        ...

can be rewritten as

with nested(mgr1, mgr2):
    ...

This means that you don't have so much indentation and since nested
takes *args you can use an arbitrary number of context managers. This
was deprecated essentially because it leads to this construction

with nested(open(file1), open(file2)) as (f1, f2):
    ...

Here the arguments are evaluated from left to right before nested is
called: first file1 is opened, then file2 is opened, and then both
are passed to nested. If an exception is raised while attempting to
open file2 then the file object returned for file1 never gets passed
to nested and is never used in any with statement, so its __enter__
and __exit__ methods are never called.
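
The evaluation order can be seen without touching the filesystem. This
is only a sketch: Tracked and combine are hypothetical names invented
for the illustration. Tracked acquires its "resource" in the
constructor, as a file object does, and combine stands in for nested.

```python
events = []


class Tracked:
    """Hypothetical stand-in for a file object: acquires on construction."""

    def __init__(self, name, fail=False):
        if fail:
            raise OSError("cannot open " + name)
        events.append("acquire " + name)
        self.name = name

    def __enter__(self):
        events.append("enter " + self.name)
        return self

    def __exit__(self, *exc_info):
        events.append("release " + self.name)
        return False


def combine(*managers):
    """Hypothetical stand-in for nested; it is never reached in this demo."""
    events.append("combine called")
    return managers


try:
    # Arguments are evaluated left to right, before combine is called.
    combine(Tracked("file1"), Tracked("file2", fail=True))
except OSError:
    pass

print(events)  # ['acquire file1']
```

The first resource is acquired but never entered or released, and
combine itself never runs, so there is nothing it could have done.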

In this simple example the file object will probably be closed by
__del__ but a significant part of the point of context managers is
that we don't want to rely on __del__ in general. Also forms that are
otherwise equivalent won't necessarily lead to __del__ being called
e.g.:

f1 = open(file1)
f2 = open(file2)
with nested(f1, f2):
    ...

Since this "deficiency" of nested is about an exception that is raised
before nested is even called it clearly wasn't possible to solve this
problem by improving nested itself. So Python 2.7 (and 3.1) introduced
the multiple with statement:

with open(file1) as f1, open(file2) as f2:
    ...

Since this is now built in to the with statement rather than using a
function it is possible to evaluate things in a different order so
e.g. f1.__enter__ here is called before open(file2) which wouldn't be
possible with a utility function like nested. Most importantly
f1.__exit__ will be called if open(file2) raises which solves the main
problem with nested. Then the nested function was deprecated in Python
2.7 and removed altogether in Python 3.2.
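
A small sketch of that ordering, again using a hypothetical Tracked
class (my own name, not a library API) that acquires in its
constructor and records what happens to it:

```python
events = []


class Tracked:
    """Hypothetical resource that acquires on construction, like a file."""

    def __init__(self, name, fail=False):
        if fail:
            raise OSError("cannot open " + name)
        events.append("acquire " + name)
        self.name = name

    def __enter__(self):
        events.append("enter " + self.name)
        return self

    def __exit__(self, *exc_info):
        events.append("release " + self.name)
        return False  # do not swallow the exception


try:
    # The first manager is entered before the second expression is evaluated.
    with Tracked("file1") as f1, Tracked("file2", fail=True) as f2:
        events.append("body")
except OSError:
    pass

print(events)  # ['acquire file1', 'enter file1', 'release file1']
```

The body never runs, but file1 is released because it had already
been entered when the second expression raised.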

The multiple with statement has problems as well though. One problem
is the syntax limitation which is the subject of the OP in this
thread. The other is the inability to take an arbitrary number of
context managers as nested could with *args.

Alternatives to nested cannot be used as cleanly though if they are
expected to meet this requirement that they should do the right thing
with exceptions raised while creating the arguments (before the
function is called!). With that constraint in mind it isn't possible
to have any utility for multiple with statements that receives more
than one context manager at a time. Hence ExitStack can be used, as it
creates an object that only receives context managers one at a time:
https://docs.python.org/3/library/contextlib.html#contextlib.ExitStack
The example given in the docs there explicitly includes open to show
you the kind of problem it is designed to solve:

with ExitStack() as stack:
    files = [stack.enter_context(open(fname)) for fname in filenames]
    # All opened files will automatically be closed at the end of
    # the with statement, even if attempts to open files later
    # in the list raise an exception

To me that seems clumsy and awkward compared to nested though:

with nested(*map(open, filenames)) as files:
    ...

Ideally I would design nested to take an iterable rather than *args
and then it would be fine to do e.g.

with nested(open(filename) for filename in filenames) as files:
    ...

Here nested could take advantage of the delayed evaluation in the
generator expression to invoke the __enter__ methods and call __exit__
on the opened files if any of the open calls fails. This would also
leave a "trap" though since using a list comprehension would suffer
the same problem as if nested took *args:

with nested([open(filename) for filename in filenames]) as files:
    ...
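
As a rough sketch of what I mean, assuming such an iterable-taking
nested were built on top of ExitStack (this nested and the
tracked_open helper are hypothetical, written only for this demo, and
are not anything in contextlib):

```python
import os
import tempfile
from contextlib import ExitStack, contextmanager


@contextmanager
def nested(managers):
    """Hypothetical nested that takes an iterable of context managers."""
    with ExitStack() as stack:
        # Each manager is entered before the next one is pulled from the
        # iterable, so a lazy iterable never creates one it cannot clean up.
        yield [stack.enter_context(m) for m in managers]


opened_files = []


def tracked_open(path):
    """Demo helper: open a file and remember the handle for inspection."""
    f = open(path)
    opened_files.append(f)
    return f


with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "good.txt")
    missing = os.path.join(tmp, "missing.txt")
    with open(good, "w") as fout:
        fout.write("data")

    # Generator expression: the first file is entered before the second
    # open is attempted, so it is closed when that open fails.
    try:
        with nested(tracked_open(p) for p in [good, missing]) as files:
            pass
    except FileNotFoundError:
        pass
    lazy_closed = opened_files[-1].closed

    # List comprehension: the failure happens before nested is called,
    # so the first file is never entered and never closed.
    try:
        with nested([tracked_open(p) for p in [good, missing]]) as files:
            pass
    except FileNotFoundError:
        pass
    eager_closed = opened_files[-1].closed
    opened_files[-1].close()  # tidy up the leaked handle by hand

print(lazy_closed, eager_closed)  # True False
```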

That's the background so what is it that we are discussing in this subthread?

I am proposing that the root of the problem here is the fact that open
acquires its resource (the opened file descriptor) before __enter__ is
called. This is what I mean by a context manager that "misbehaves". If
there was a requirement on context managers that __exit__ cleans up
after __enter__ and any resource that needs cleaning up should only be
acquired in __enter__ then there would never have been a problem with
nested.

In particular PEP 343 gives an alternative to the current behaviour of
open, which is

@contextmanager
def opened(filename, mode="r"):
    f = open(filename, mode)
    try:
        yield f
    finally:
        f.close()

https://www.python.org/dev/peps/pep-0343/#examples

Because this uses the contextmanager decorator it may not immediately
be obvious but this function does not suffer any of the problems
described above. That is because what this returns is not a file
object but rather an object that can only be used as a context
manager. It is the __enter__ method of this context manager that opens
the file and returns a usable file object. Here is a simple
demonstration:

>>> from contextlib import contextmanager
>>> @contextmanager
... def f():
...     print(1) # Executed on __enter__
...     try:
...         yield 3
...     finally:
...         pass
...
>>> f()
<contextlib._GeneratorContextManager object at 0x10786be10>
>>> f().__enter__()
1
3

That means that there is no problem with using

with nested(opened(filename1), opened(filename2)) as (file1, file2):
    ...

or any of the variations on this above. For whatever reason this is
not what was released in Python 2.5 which instead added the __enter__
and __exit__ methods to file objects themselves so that the existing
open builtin could be used directly with the with statement.
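
To make the difference concrete, here is a sketch under some
assumptions: nested below is a hypothetical *args-style
reimplementation on top of ExitStack (the original is gone from
Python 3), and opened has one extra line of instrumentation (the
handles list) so that the demo can inspect the file object afterwards.

```python
import os
import tempfile
from contextlib import ExitStack, contextmanager

handles = []  # instrumentation for this demo only


@contextmanager
def opened(filename, mode="r"):
    f = open(filename, mode)  # runs on __enter__, not when opened() is called
    handles.append(f)
    try:
        yield f
    finally:
        f.close()


@contextmanager
def nested(*managers):
    """Hypothetical *args-style nested, rebuilt on ExitStack."""
    with ExitStack() as stack:
        yield tuple(stack.enter_context(m) for m in managers)


with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "good.txt")
    missing = os.path.join(tmp, "missing.txt")
    with open(good, "w") as fout:
        fout.write("data")

    # Creating the manager acquires nothing, so a missing file is not
    # an error yet.
    mgr = opened(missing)
    nothing_opened_yet = len(handles) == 0

    # The failure now happens inside nested, which can unwind cleanly.
    try:
        with nested(opened(good), opened(missing)) as (file1, file2):
            pass
    except FileNotFoundError:
        pass

    all_closed = len(handles) == 1 and handles[0].closed

print(nothing_opened_yet, all_closed)  # True True
```

Both arguments are created safely before nested is called, and the
only file that was actually opened has been closed by the time the
exception propagates.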

What I am saying is that conceived as a context manager the object
returned by open misbehaves. I think that not just nested but a number
of other convenient utilities and patterns could have been possible if
opened had been used instead of open and if context managers were
expected to meet the constraint:
"""
There should be no need to call __exit__ if __enter__ has not been called.
"""
Of course a lot of time has passed since then and now there are
probably many other misbehaving context managers so it might be too
late to do anything about that.


Oscar
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/KIYSRWSDURSP7GC6SIZBUMRZRZBUK27P/
Code of Conduct: http://python.org/psf/codeofconduct/