New submission from Nathaniel Smith:

In trip_signal [1], the logic goes:

1) set the flag saying that this particular signal was tripped
2) write to the wakeup fd
3) set the global is_tripped flag saying "at least one signal was tripped", and 
do Py_AddPendingCall (which sets some global flags that the bytecode 
interpreter checks on every pass through the loop)

So the problem here is that it's step (2) that wakes up the main thread to 
check for signals, but it's step (3) that actually arranges for the 
Python-level signal handler to run. (Step (1) turns out to be irrelevant, 
because no-one looks at the per-signal flags unless the global is_tripped flag 
is set. This might be why no-one noticed this bug through code inspection 
though – I certainly missed it, despite explicitly checking for it several 
times!)

The result is that the following sequence of events is possible:

- signal arrives (e.g. SIGINT)
- trip_signal writes to the wakeup fd
- the main thread blocked in IO wait sees this, and wakes up
- the main thread checks for signals, and doesn't find any
- the main thread empties the wakeup fd
- the main thread goes back to sleep
- trip_signal sets the flags to request the Python-level signal handler be run
- the main thread doesn't notice, because it's asleep

It turns out that this is a real thing that can actually happen; it's causing 
an annoying intermittent failure in the trio testsuite on appveyor; and under 
the correct conditions I can reproduce it very reliably in my local Windows VM. 
See [2].

I think the fix is just to swap the order of steps (2) and (3), so we write to 
the wakeup fd last. Unfortunately I can't easily test this because I don't have 
a way to build CPython on Windows. But [2] has some IMHO pretty compelling 
evidence that this is what's happening.

[1] 
https://github.com/python/cpython/blob/6fab78e9027f9ebd6414995580781b480433e595/Modules/signalmodule.c#L238-L291
[2] https://github.com/python-trio/trio/issues/119

----------
messages: 291467
nosy: haypo, njs
priority: normal
severity: normal
status: open
title: Race condition in how trip_signal writes to wakeup fd
versions: Python 3.5, Python 3.6, Python 3.7

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue30038>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com

Reply via email to