Hi,

I spent the last few weeks fixing issues specific to the Windows ProactorEventLoop. Although the code "was working" in most cases, I sometimes noticed strange warnings, bugs or crashes. Good news: all known issues are now fixed, and the test suite now passes and is stable!
Please test ProactorEventLoop as much as possible! My changes are merged into the development versions of Tulip, Trollius, Python 3.4 and Python 3.5. I also added new tests, which should reduce the risk of regressions. By the way, ProactorEventLoop now supports SSL on Python 3.5 and newer!

I'm writing this email to keep a record of the changes that I made to fix all these issues.

Major changes:

(1) IocpProactor.connect_pipe() was implemented using a thread which could not be interrupted. There were hacks in IocpProactor to work around issues related to this. I rewrote the code to use explicit polling, with a delay that increases from 1 ms up to 100 ms.

(2) I fixed IocpProactor.accept_pipe(). The function now uses the result of ConnectNamedPipe() to decide whether the overlapped operation must be registered to wait for its completion, or whether it is already done. I made a similar change to IocpProactor.recv() (ReadFile() now raises an exception on a broken pipe error).

(3) I fixed the cancellation of the IocpProactor.wait_for_handle() future.

I spent most of my time trying to fix the last issue, the cancellation of wait_for_handle(). This issue was annoying because it emitted unexpected completions. For example, a process was seen as terminated while it was still running. It also sometimes emitted "unexpected event" warnings. Sometimes it simply crashed, because Windows tried to write into a memory block which had already been released. I told you, a lot of fun.

The internal machinery of the Windows RegisterWaitForSingleObject() function is very complex. Basically, RegisterWaitForSingleObject() is implemented with a blocking call which runs in a thread. The annoying point is that UnregisterWait() doesn't cancel the wait immediately: it only "schedules" the cancellation. This point is not clear in the documentation; it took me hours to understand it.

Ok, now it gets funnier. UnregisterWaitEx() exists so that we can be notified when the wait is cancelled: an event will be set.
Ok, but how can we wait for this notification using an IOCP? Using RegisterWaitForSingleObject() again! What? To cancel a first RegisterWaitForSingleObject(), we have to call RegisterWaitForSingleObject() again on a new event? How can we cancel the second wait? ... To protect my head against an obvious explosion, I decided to forbid the cancellation of the second kind of wait :-)

Someone may find a more efficient way to wait for the cancellation of the first wait. I don't know Windows internals well enough. Maybe we should reimplement RegisterWaitForSingleObject() in Python to have better control over threads and objects? I don't know yet if it would make sense to reimplement it.

More details!

RegisterWaitForSingleObject() is implemented as a pool of threads (500 max. by default). Each thread calls the blocking WaitForMultipleObjects() function, which can only wait for 64 objects. To be able to interact with these threads, each thread uses a timer (so each thread can only wait for 63 objects). It computes the next timeout of all registered wait operations. To modify the list of wait operations (RegisterWait..., UnregisterWait...), the timer is reset to wake up WaitForMultipleObjects(), and so wake up the thread.

Since we are talking about threads, and even a pool of threads, all operations are asynchronous. RegisterWaitForSingleObject() may spawn a new thread, and UnregisterWait[Ex]() may stop a thread (one which has nothing left to do).

FYI it's also possible to use UnregisterWaitEx() in blocking mode, but that is not useful in the context of asyncio.

Full list of recent IOCP issues in the Tulip and Python bug trackers. They are now all closed.

"_WaitHandleFuture.cancel() crash if the wait event was already unregistered"
https://code.google.com/p/tulip/issues/detail?id=195

"_OverlappedFuture.set_result() should clear the its reference to the overlapped object"
https://code.google.com/p/tulip/issues/detail?id=196

"Rewrite IocpProactor.connect_pipe() with non-blocking calls to avoid non interruptible QueueUserWorkItem()"
https://code.google.com/p/tulip/issues/detail?id=197

"Investigate IocpProactor.accept_pipe() special case (don't register overlapped)"
https://code.google.com/p/tulip/issues/detail?id=204

"race condition when cancelling a _WaitHandleFuture"
http://bugs.python.org/issue23095

"race condition related to IocpProactor.connect_pipe()"
http://bugs.python.org/issue23293

Victor

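PS: To give an idea of the polling approach mentioned in (1), here is a minimal sketch. It is not the actual IocpProactor code. The names Busy, poll_with_backoff() and make_fake_connect() are hypothetical, the fake connect function stands in for the real pipe connection attempt, and doubling the delay at each retry is an assumption (the text above only says that the delay increases between 1 ms and 100 ms).

```python
import asyncio

INIT_DELAY = 0.001   # 1 ms: lower bound mentioned in the email
MAX_DELAY = 0.100    # 100 ms: upper bound mentioned in the email


class Busy(Exception):
    """Hypothetical stand-in for "the pipe is busy, retry later"."""


async def poll_with_backoff(try_once, init_delay=INIT_DELAY,
                            max_delay=MAX_DELAY):
    """Call try_once() until it stops raising Busy.

    Return (result, delays), where delays lists the sleeps that were used.
    """
    delay = init_delay
    delays = []
    while True:
        try:
            return try_once(), delays
        except Busy:
            delays.append(delay)
            # A plain await: cancelling the task interrupts the sleep,
            # unlike a blocking call running in a worker thread.
            await asyncio.sleep(delay)
            # Assumption: double the delay, capped at max_delay.
            delay = min(delay * 2, max_delay)


def make_fake_connect(failures):
    """Hypothetical connect function: busy `failures` times, then succeeds."""
    state = {"left": failures}

    def connect():
        if state["left"] > 0:
            state["left"] -= 1
            raise Busy()
        return "connected"

    return connect


result, delays = asyncio.run(poll_with_backoff(make_fake_connect(9)))
```

The point of the design is that every retry goes through the event loop, so the operation can be cancelled between two attempts, and the capped delay keeps the polling cheap when the pipe stays busy for a long time.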