Jason Dillaman created QPID-4286:
------------------------------------
Summary: QMF queries for HA replication take too long to process
Key: QPID-4286
URL: https://issues.apache.org/jira/browse/QPID-4286
Project: Qpid
Issue Type: Bug
Components: C++ Broker
Affects Versions: 0.18
Reporter: Jason Dillaman
In an HA broker with approximately 12,000 queues, it takes roughly 10-14
seconds for the first QMF response fragment to arrive. While the QMF
management agent is collecting the response, all other QMF-related
functionality is blocked, which in turn blocks any thread that raises a QMF
event.
This has two consequences. First, clients get disconnected from the broker
because worker threads are blocked by QMF, either through missed heartbeats (in
an extreme case) or through the 2-second handshake timeout. Second, the HA
backup's federated link gets disconnected due to missed heartbeats when the
link heartbeat interval is set to a low value.
If the HA backup loses its connection, it only exacerbates the issue since it
will reconnect and re-query the QMF data that made it lose its connection in
the first place.
Recommend that QMF events not be blocked by a global management agent lock and
also recommend that potentially long-running QMF queries be separated from the
worker thread that initiated them to prevent a heartbeat timeout.
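
The two recommendations above could be sketched roughly as follows. This is a
minimal illustration under stated assumptions, not the broker's code:
SketchAgent, raiseEvent, queryAsync, eventLock_ and objectLock_ are hypothetical
names and do not correspond to the Qpid ManagementAgent API. Events take their
own short-lived lock, and the query runs on a separate thread that holds the
object lock only long enough to take a snapshot.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <future>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical stand-in for a management agent (assumption, not Qpid code).
class SketchAgent {
public:
    explicit SketchAgent(std::size_t queueCount) {
        for (std::size_t i = 0; i < queueCount; ++i)
            objects_.push_back("queue-" + std::to_string(i));
    }

    // Called from worker threads. Takes only the event lock, so it never
    // waits for a query response that is still being collected.
    void raiseEvent(const std::string& event) {
        std::lock_guard<std::mutex> guard(eventLock_);
        events_.push_back(event);
    }

    // Runs the query on its own thread and returns immediately, leaving the
    // calling worker thread free to service heartbeats. slowWork is a gate
    // that stands in for the expensive part of building the response.
    std::future<std::vector<std::string>> queryAsync(std::shared_future<void> slowWork) {
        return std::async(std::launch::async,
                          [this, slowWork] { return collect(slowWork); });
    }

    std::size_t eventCount() {
        std::lock_guard<std::mutex> guard(eventLock_);
        return events_.size();
    }

private:
    std::vector<std::string> collect(std::shared_future<void> slowWork) {
        std::vector<std::string> snapshot;
        {
            // Hold the object lock only while copying, not while encoding.
            std::lock_guard<std::mutex> guard(objectLock_);
            snapshot = objects_;
        }
        slowWork.wait();  // simulated long-running encoding, outside any lock
        return snapshot;
    }

    std::mutex objectLock_;  // guards objects_
    std::mutex eventLock_;   // guards events_, independent of objectLock_
    std::vector<std::string> objects_;
    std::vector<std::string> events_;
};

// Usage: raise an event while a query over 12,000 objects is still pending.
bool eventsProceedDuringQuery() {
    SketchAgent agent(12000);
    std::promise<void> finishEncoding;
    auto reply = agent.queryAsync(finishEncoding.get_future().share());

    agent.raiseEvent("queueDeclare");  // returns although the query is in flight
    assert(agent.eventCount() == 1);
    bool stillPending =
        reply.wait_for(std::chrono::seconds(0)) != std::future_status::ready;

    finishEncoding.set_value();        // let the simulated query complete
    return stillPending && reply.get().size() == 12000;
}
```

The snapshot copy trades memory for shorter lock hold times. Whether that
trade-off is acceptable for the real management agent is an open question that
this sketch does not answer.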