On Tue, Aug 8, 2017 at 2:54 AM, Jonathan Slenders <jonat...@slenders.be> wrote:
> Hi all,
> Is it possible that thread.join() cannot be interrupted on Windows, while it
> can be on Linux?
> Would this be a bug, or is it by design?
> import threading, time
> def wait():
>     time.sleep(1000)
> t = threading.Thread(target=wait)
> t.start()
> # Press Control-C now. It stops on Linux, while it hangs on Windows.
> t.join()

This comes down to a difference in how the Linux and Windows low-level
APIs handle control-C and blocking functions: on Linux, the default is
that any low-level blocking function can be interrupted by a control-C
or any other random signal, and it's the calling code's job to check
for this and restart it if necessary. This is annoying because it
means that every low-level function call inside the Python interpreter
has to have a bunch of boilerplate to detect this and retry, but OTOH
it means that control-C automatically works in (almost) all cases. On
Windows, they made the opposite decision: low-level blocking functions
are never automatically interrupted by control-C. It's a reasonable
design choice. The advantage is that sloppily written programs tend to
work better -- on Linux you kind of *have* to put a retry loop around
*every* low level call or your program will suffer weird random bugs,
and on Windows you don't.

But for carefully written programs like CPython this is actually
pretty annoying, because if you *do* want to wake up on a control-C,
then on Windows that has to be laboriously implemented on a
case-by-case basis for each blocking function, and often this requires
some kind of cleverness or is actually impossible, depending on what
function you want to interrupt. At least on Linux the retry loop is
always the same.
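
To make "the retry loop is always the same" concrete, here is a rough
Python-level sketch of the pattern. The real boilerplate lives in C
inside the interpreter; retry_on_eintr and flaky_read are made-up names
for illustration, and flaky_read only simulates a call that gets
interrupted twice. (Since Python 3.5 the interpreter does this retry
for you, so you rarely see InterruptedError from Python code.)

```python
def retry_on_eintr(func, *args, **kwargs):
    """Call func, restarting it whenever a signal interrupts it.

    Hypothetical helper: it mirrors the check-and-retry boilerplate
    described above, not any actual CPython function.
    """
    while True:
        try:
            return func(*args, **kwargs)
        except InterruptedError:
            # The call was interrupted by a signal (EINTR) before it
            # finished. Nothing went wrong, so just start it again.
            # KeyboardInterrupt is deliberately not caught here, which
            # is what lets control-C break out of the loop.
            continue


# Simulated blocking call (an assumption for the demo): it fails with
# InterruptedError on the first two attempts, then succeeds.
attempts = {"count": 0}


def flaky_read():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise InterruptedError("interrupted by a signal")
    return b"data"


result = retry_on_eintr(flaky_read)
```

The same wrapper works for any blocking call, which is the point: on
Linux there is one pattern to apply everywhere, whereas on Windows each
blocking function needs its own wake-up mechanism.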

The end result is that on Windows, control-C almost never works to
wake up a blocked Python process, with a few special exceptions where
someone did the work to implement this. On Python 2 the only functions
that have this implemented are time.sleep() and
multiprocessing.Semaphore.acquire; on Python 3 there are a few more
(you can grep the source for _PyOS_SigintEvent to find them), but
Thread.join isn't one of them.

It looks like Thread.join ultimately ends up blocking in
Python/thread_nt.h:EnterNonRecursiveMutex, which has a maze of #ifdefs
behind it -- I think there are 3 different implementations you might
end up with, depending on how CPython was built? Two of them seem to
ultimately block in WaitForSingleObject, which would be easy to adapt
to handle control-C. Unfortunately I think the implementation that
actually gets used on modern systems is the one that blocks in
SleepConditionVariableSRW, and I don't see any easy way for a
control-C to interrupt that. But maybe I'm missing something -- I'm
not a Windows expert.
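
In the meantime, a workaround that is commonly used (this is a sketch,
not a fix for the underlying issue) is to join with a short timeout in
a loop, so the main thread regularly returns to the interpreter, where
a pending control-C can be turned into KeyboardInterrupt.
interruptible_join and poll_interval are names I made up here, and the
0.1 second default is an arbitrary choice.

```python
import threading
import time


def interruptible_join(thread, poll_interval=0.1):
    """Wait for thread to finish, waking up every poll_interval seconds.

    Hypothetical helper: each timed join() returns control to the
    interpreter, which gives it a chance to run the SIGINT handler
    between polls instead of blocking indefinitely.
    """
    while thread.is_alive():
        thread.join(timeout=poll_interval)


if __name__ == "__main__":
    def wait():
        time.sleep(1000)

    # daemon=True so the process can exit after control-C even though
    # the worker is still sleeping.
    t = threading.Thread(target=wait, daemon=True)
    t.start()
    interruptible_join(t)  # Press Control-C now.
```

The cost is a periodic wakeup and up to poll_interval of extra latency
before the join notices that the thread has finished.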


