[
https://issues.apache.org/jira/browse/QPID-4286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Alan Conway resolved QPID-4286.
-------------------------------
Resolution: Fixed
Committed Jason's patch on trunk:
------------------------------------------------------------------------
r1398530 | aconway | 2012-10-15 17:35:38 -0400 (Mon, 15 Oct 2012) | 6 lines
QPID-4286: QMF queries for HA replication take too long to process (Jason
Dillaman)
Rework ManagementAgent locks, get rid of shared buffers that were points of
contention.
Minor log message improvements in ha code.
------------------------------------------------------------------------
> QMF queries for HA replication take too long to process
> -------------------------------------------------------
>
> Key: QPID-4286
> URL: https://issues.apache.org/jira/browse/QPID-4286
> Project: Qpid
> Issue Type: Bug
> Components: C++ Broker
> Affects Versions: 0.18
> Reporter: Jason Dillaman
> Assignee: Alan Conway
> Attachments: qpid-4286-fixes.patch, qpid-4286.patch
>
>
> In an HA broker with approximately 12,000 queues, it takes roughly 10-14
> seconds for the first QMF response fragment to arrive. While the QMF
> management agent is collecting the response, all other QMF-related
> functionality is blocked -- which will block any thread that raises a QMF
> event.
> Worker threads blocked by QMF cause clients to be disconnected from the
> broker (through missed heartbeats in an extreme case, or through the 2-second
> handshake timeout). The HA backup's federated link is also disconnected
> through missed heartbeats when the link heartbeat interval is set to a low
> value.
> If the HA backup loses its connection, it only exacerbates the issue since it
> will reconnect and re-query the QMF data that made it lose its connection in
> the first place.
> Recommend that QMF events not be blocked by a global management agent lock
> and also recommend that potentially long-running QMF queries be separated
> from the worker thread that initiated them to prevent a heartbeat timeout.
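
The first recommendation above, together with the "get rid of shared buffers"
note in the commit message, can be illustrated with a small sketch. This is
not the code from the committed patch: AgentSketch and all of its members are
hypothetical names invented for illustration, and the real ManagementAgent is
considerably more involved. The sketch only shows the general idea of giving
events their own lock and building each query response in a buffer local to
the call, so the object lock is held just long enough to copy a snapshot. It
does not cover moving long-running queries off the worker thread.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical stand-in for a management agent (not the qpid ManagementAgent API).
class AgentSketch {
public:
    void addObject(const std::string& name, int depth) {
        std::lock_guard<std::mutex> l(objectLock_);
        objects_[name] = depth;
    }

    // Raising an event takes only the event lock, so it never waits
    // behind a query that is busy with the object map.
    void raiseEvent(const std::string& text) {
        std::lock_guard<std::mutex> l(eventLock_);
        events_.push_back(text);
    }

    std::size_t eventCount() const {
        std::lock_guard<std::mutex> l(eventLock_);
        return events_.size();
    }

    // Hold the object lock only while copying a snapshot, then encode the
    // response into a buffer owned by this call instead of a shared one.
    std::vector<std::string> query(std::size_t objectsPerFragment) const {
        assert(objectsPerFragment > 0);
        std::map<std::string, int> snapshot;
        {
            std::lock_guard<std::mutex> l(objectLock_);
            snapshot = objects_;
        }
        std::vector<std::string> fragments;
        std::string buffer;  // local to this call: no contention with other callers
        std::size_t count = 0;
        for (const auto& entry : snapshot) {
            buffer += entry.first + "=" + std::to_string(entry.second) + ";";
            if (++count % objectsPerFragment == 0) {
                fragments.push_back(buffer);
                buffer.clear();
            }
        }
        if (!buffer.empty()) fragments.push_back(buffer);
        return fragments;
    }

private:
    mutable std::mutex objectLock_;  // guards objects_ only
    mutable std::mutex eventLock_;   // guards events_ only
    std::map<std::string, int> objects_;
    std::vector<std::string> events_;
};
```

Copying the snapshot costs memory in proportion to the number of objects, so
with many thousands of queues a real implementation might prefer to copy only
identifiers or to walk the map in batches. The point here is just that the
slow encoding step runs without any lock held.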