-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4383/
-----------------------------------------------------------
(Updated 2012-04-19 06:52:43.361572)
Review request for qpid, Andrew Stitcher, Ted Ross, Chug Rolke, and Steve
Huston.
Changes
-------
Load tests over a period of time reveal a threading bug when closing a
connection.
The test for opsInProgress == 0 and the states of queuedDelete and
queuedClose occurs outside the lock. If an IO thread suspends right after
releasing the lock (opsInProgress == 1) and resumes some time later, after
another IO thread has decremented opsInProgress to zero, both threads will
conclude that they are the last IO completion. This results variously in
double deletes of the underlying socket or of the AsynchIO object itself.
This patch moves the test inside the lock.
It also uses the same lock to protect the setting of either queuedDelete or
queuedClose and the handoff (if any) to the IO thread. This adds two lock
acquisitions over the life of the connection, but should have no effect on
throughput or latency.
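
To make the race fix concrete, here is a minimal sketch of the "test inside
the lock" idea. It is not the code from the patch: OpTracker, completeOne,
queueClose, queueDelete and countCleanups are hypothetical names, and only
opsInProgress, queuedClose and queuedDelete come from the description above.
A plain std::mutex stands in for whatever lock AsynchIO actually uses.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical stand-in for the AsynchIO bookkeeping. Only opsInProgress,
// queuedClose and queuedDelete are names taken from the description.
class OpTracker {
public:
    explicit OpTracker(int ops) : opsInProgress(ops) {}

    // Request a close. Returns true if no operation is outstanding, meaning
    // the caller must arrange the close itself (e.g. hand it to an IO thread).
    bool queueClose() {
        std::lock_guard<std::mutex> l(lock);
        queuedClose = true;
        return opsInProgress == 0;
    }

    // Same idea for a queued delete.
    bool queueDelete() {
        std::lock_guard<std::mutex> l(lock);
        queuedDelete = true;
        return opsInProgress == 0;
    }

    // Called by an IO thread as one completion finishes. The decrement and
    // the "am I last?" test happen under the same lock, so at most one
    // caller can see true.
    bool completeOne() {
        std::lock_guard<std::mutex> l(lock);
        assert(opsInProgress > 0);
        --opsInProgress;
        return opsInProgress == 0 && (queuedClose || queuedDelete);
    }

private:
    std::mutex lock;
    int opsInProgress;
    bool queuedClose = false;
    bool queuedDelete = false;
};

// Counts how many of two completions conclude that they are the last one.
int countCleanups() {
    OpTracker tracker(2);
    tracker.queueClose();
    int cleanups = 0;
    if (tracker.completeOne()) ++cleanups;
    if (tracker.completeOne()) ++cleanups;
    return cleanups;
}
```

Because the result is computed while the lock is held, a thread that is
suspended after returning cannot later re-read a counter that another thread
has since decremented to zero.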
Summary
-------
The cause of the hang was an outstanding read side completion when the AsynchIO
object in charge of the socket was in the queuedClose state.
The completion handler drains outstanding async requests before closing the
socket. Since the cable had been pulled, the async read would never complete
until Windows gave up on the socket altogether (some time much later).
This patch remembers the last aio read and, if the object is in the
queuedClose state, cancels that read before blocking again.
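
The cancel-on-close logic can be sketched roughly as follows. This is a
portable model under stated assumptions, not the patch itself: ReadTracker,
ReadHandle, cancelReadIfClosing and countCancels are hypothetical names, and
the cancel hook is a placeholder for whatever the platform offers (on Windows
it might wrap a call such as CancelIoEx, but that is an assumption, not
something stated in this request).

```cpp
#include <functional>
#include <mutex>
#include <utility>

using ReadHandle = int;  // hypothetical stand-in for an outstanding async read

class ReadTracker {
public:
    explicit ReadTracker(std::function<void(ReadHandle)> cancelFn)
        : cancel(std::move(cancelFn)) {}

    // Remember the most recently issued read.
    void readStarted(ReadHandle h) {
        std::lock_guard<std::mutex> l(lock);
        lastRead = h;
        readPending = true;
        cancelIssued = false;
    }

    void readCompleted() {
        std::lock_guard<std::mutex> l(lock);
        readPending = false;
    }

    void queueClose() {
        std::lock_guard<std::mutex> l(lock);
        queuedClose = true;
    }

    // Call before blocking for more completions. Cancels the remembered read
    // once if a close is queued; returns true if a cancel was issued.
    bool cancelReadIfClosing() {
        ReadHandle h;
        {
            std::lock_guard<std::mutex> l(lock);
            if (!queuedClose || !readPending || cancelIssued) return false;
            cancelIssued = true;
            h = lastRead;
        }
        cancel(h);  // invoked outside the lock
        return true;
    }

private:
    std::mutex lock;
    std::function<void(ReadHandle)> cancel;
    ReadHandle lastRead = 0;
    bool readPending = false;
    bool queuedClose = false;
    bool cancelIssued = false;
};

// Counts how often the cancel hook fires in a simple close sequence.
int countCancels() {
    int cancels = 0;
    ReadTracker tracker([&cancels](ReadHandle) { ++cancels; });
    tracker.readStarted(7);
    tracker.cancelReadIfClosing();  // not closing yet: no cancel
    tracker.queueClose();
    tracker.cancelReadIfClosing();  // closing with a read pending: cancel
    tracker.cancelReadIfClosing();  // already cancelled: no second cancel
    return cancels;
}
```

In this model the cancelled read is still expected to complete (with an
error), so readPending stays set until readCompleted is called; the drain
loop then finishes promptly instead of waiting on a dead socket.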
Beyond the fix described in the Jira, I also removed an unused test for
restartRead. Removing it doesn't change the logic of the section, but the
test may indicate an intention that wasn't fully coded, or something left
over from a previous change.
This addresses bug QPID-3759.
https://issues.apache.org/jira/browse/QPID-3759
Diffs (updated)
-----
http://svn.apache.org/repos/asf/qpid/trunk/qpid/cpp/src/qpid/sys/windows/AsynchIO.cpp
1327776
Diff: https://reviews.apache.org/r/4383/diff
Testing
-------
qpid-perftest, qpid-send, qpid-receive, cable pulls, broker pause/resumes
Thanks,
Cliff