[issue30931] Race condition in asyncore may access the wrong dispatcher

STINNER Victor Thu, 27 Jul 2017 03:11:48 -0700

STINNER Victor added the comment:

Ok, wow, the discussion on this issue, bpo-30985, PR 2854, PR 2804, etc. was 
very productive. We identified much more race conditions than the first one 
that I spotted.



(Bug 1) The dispatcher A closes the dispatcher B. Currently, asyncore calls the 
handlers of the dispatcher B.

Technically, the close() method of a dispatcher closes immediately a socket 
object. So it's very likely that handlers of the dispatcher B fails on recv(), 
send(), or any other operation on the socket.

There is also a risk that somehow dispatcher B tries to closes again the socket 
(technically, socket, pipe or something else) and gets a OSError(EBADF) or 
worse: closes an unrelated socket!


(Bug 2) Similar to (Bug 1), but the subtle coming from the current 
implementation causes even worse bugs. The dispatcher A closes the dispatcher 
B, and then the dispatcher A (or another dispatcher, it doesn't matter who make 
that) creates a new dispatcher C which reuses the same file descriptor than the 
old closed file descriptor of dispatcher B.

The consequence is that handlers of dispatcher C are called even if the socket 
is not ready to read or to write. Since sockets are non-blocking, recv() and 
send() are likely to fail with BlockingIOError. These errors are usually 
correctly handled in asynchronous code, but they can cause bugs. More 
generally, calling handlers when the socket is not ready can cause many kinds 
of bugs.


(Bug 3) If a socket is ready to read and ready to write, handle_read() and 
handle_write() handlers are expected to be called. The problem is when 
handle_read() (first called handler) closes the dispatcher.

It was decided to not call handle_write() is that case.


(Bug 4) Ok, this one is the best one :-) When a socket is ready, asyncore calls 
methods like handle_write_event(). These methods usually call a single handler 
like handle_write(), *but* sometimes can call multiple handlers like 
handle_connect() followed by handle_write(). If a first handler closes the 
dispatcher, the following handler may fail badly on trying to use a closed 
socket.

IMHO (Bug 3) and (Bug 4) are less surprising than (Bug 1) or (Bug 2) since they 
are determistic and so less "race condition". Since these bugs are determistic, 
I expect that applications are already written to handle these corner cases, 
somehow. The simplest option is to catch and ignore errors, asyncore already 
ignores many errors for us. For example asyncore already ignores EBADF, whereas 
IMHO it hides a design issue (the 4 bugs I just listed).

Technically, fixing (Bug 3) using my PR 2854 design (check if map[fd] is still 
the same dispatcher) is safe since it doesn't rely at all on the actual 
implementation of asyncore handle_xxx_event() (which can be overriden) not 
handle_xxx() methods defined by third party code.

Fixing (Bug 4) is more complex since it requires to decide how to detect that a 
dispatcher was closed (it seems like "dispatcher._fileno is None" test is the 
best choice), and to modify handle_xxx_event() methods, whereas these methods 
can and *are* overridden in subclasses.

--

About backward compatibility, in short, fixing any of these bug changes the 
exact behaviour, since fixing all these bugs require to *not* call a handler if 
the dispatcher was closed. *If* a specific application requires that handlers 
are called anyway, changes are likely to break these applications.

A workaround for these applications is to copy asyncore.py from an old Python 
version to keep the exact same behaviour. Another smoother workaround is to 
monkey-patch poll() and poll2(), and maybe also override handle_xxx_event() 
methods, to restore the old behaviour.

I don't want to add a flag to re-enable the old behaviour, since it seems like 
we all agree that the described bugs are clearly very bad design bugs and that 
they must be fixed.

--

Another option, Nir will probably not like it, is to consider that asyncore is 
dead and will not be fixed.

That's not my favorite option. Nir explained that on Python 2.7, asyncore is 
the most convenient module for his use case, he deeply rely on it, and so "just 
want" to fix know bugs.

--

I suggest to use this bpo to fix (Bug 1), (Bug 2) and (Bug 3), since they can 
all be fixed in poll() and poll2() with the same design, like my PR 2854.

For (Bug 4), I suggest to use bpo-30985 where we already discussed many options 
how to detect that a dispatcher was closed. I have no strong opinion on this 
bug, if it should be fixed or not. IMHO it's less important and applications 
already know how handle the bug.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue30931>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com

[issue30931] Race condition in asyncore may access the wrong dispatcher

Reply via email to