[
https://issues.apache.org/jira/browse/QPID-3759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13257314#comment-13257314
]
[email protected] commented on QPID-3759:
-----------------------------------------------------
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4383/
-----------------------------------------------------------
(Updated 2012-04-19 06:52:43.361572)
Review request for qpid, Andrew Stitcher, Ted Ross, Chuck Rolke, and Steve
Huston.
Changes
-------
Load tests run over a period of time reveal a threading bug when closing a
connection.
The test for opsInProgress == 0 and the states of queuedDelete and
queuedClose occurs outside the lock. If an IO thread suspends right after
releasing the lock (opsInProgress == 1) and resumes some time later, after
another IO thread has decremented opsInProgress to zero, both threads will
conclude that they are the last IO completion. This results variously in
double deletes of the underlying socket or of the AsynchIO object itself.
This patch moves the test inside the lock.
It also uses the same lock to protect the setting of either queuedDelete or
queuedClose and the handoff (if any) to the IO thread. This adds two lock
acquisitions over the life of the connection, but should have no measurable
effect on throughput or latency.
Summary
-------
The cause of the hang was an outstanding read-side completion while the
AsynchIO object in charge of the socket was in the queuedClose state.
The completion handler drains outstanding async requests before closing the
socket. Since the cable had been pulled, the async read would never complete
until Windows gave up on the socket altogether (some time much later).
This patch remembers the last aio read and cancels it, if the object is in
the queuedClose state, before blocking again.
Beyond the basic description in the Jira, I also removed an unused test for
restartRead. This doesn't change the logic of the section, but the test may
indicate an intention that was never fully coded, or be something left over
from a previous change.
This addresses bug QPID-3759.
https://issues.apache.org/jira/browse/QPID-3759
Diffs (updated)
-----
http://svn.apache.org/repos/asf/qpid/trunk/qpid/cpp/src/qpid/sys/windows/AsynchIO.cpp
1327776
Diff: https://reviews.apache.org/r/4383/diff
Testing
-------
qpid-perftest, qpid-send, qpid-receive, cable pulls, broker pause/resumes
Thanks,
Cliff
> Heartbeat timeout in Windows does not lead to timely reconnect
> --------------------------------------------------------------
>
> Key: QPID-3759
> URL: https://issues.apache.org/jira/browse/QPID-3759
> Project: Qpid
> Issue Type: Bug
> Components: C++ Client
> Affects Versions: 0.14
> Environment: Windows C++ messaging
> Reporter: Chuck Rolke
> Assignee: Cliff Jansen
> Fix For: 0.17
>
> Attachments: main.cpp
>
>
> Reported by Wolf Wolfswinkel on Qpid users
> http://qpid.2158936.n2.nabble.com/Heartbeats-in-C-broker-on-Windows-td7118702.html
> 22-Dec-2011
> The simplest test case is in attached main.cpp. Establish a good network
> connection to the broker and then start the program. It creates a connection,
> sends two messages, and then pauses for 15 seconds. During the pause
> disconnect the network connection to the broker for at least two heartbeat
> timeouts (12 seconds).
> After the heartbeat timeout the timer task fires and a debug trace shows:
> Traffic timeout, TCPConnector::abort, TCPConnector::eof, TCPConnector::close
> But the connection is not actually closed until something happens on the
> network to wake up the thread waiting in Poller::run().
> The timer event appears unable to interrupt the IO thread waiting for the
> completion port.