Alex Rudyy created QPID-8276:
--------------------------------

             Summary: [Broker-J] Broker can leak closed NonBlockingConnection objects and eventually run out of heap memory
                 Key: QPID-8276
                 URL: https://issues.apache.org/jira/browse/QPID-8276
             Project: Qpid
          Issue Type: Improvement
          Components: Broker-J
    Affects Versions: qpid-java-broker-7.0.6, qpid-java-broker-7.0.5, qpid-java-broker-7.0.4, qpid-java-broker-7.1.0, qpid-java-6.1.7, qpid-java-broker-7.0.1, qpid-java-broker-7.0.0, qpid-java-broker-7.0.2, qpid-java-broker-7.0.3
            Reporter: Alex Rudyy
             Fix For: qpid-java-broker-7.0.7, qpid-java-broker-7.1.1

The Qpid Broker-J can leak closed {{NonBlockingConnection}} objects.

Heap dump analysis of an impacted broker instance revealed that the leaked {{NonBlockingConnection}} objects accumulate in {{SelectorThread.SelectionTask#_unscheduledConnections}} belonging to the AMQP port IO pool. They have no ticker set and no state changed flag set ({{NonBlockingConnection#isStateChanged() == false}}). As a result, the {{NonBlockingConnection}} objects are not removed from {{SelectorThread.SelectionTask#_unscheduledConnections}} when {{SelectorThread.SelectionTask#processUnscheduledConnections()}} is invoked from {{SelectorThread.SelectionTask#performSelect()}}.

The {{NonBlockingConnection}} and the underlying model object are in closed state. It seems that the leaked {{NonBlockingConnection}} was closed as part of an invocation of {{NonBlockingConnection#doWork()}}. The connection was unregistered from the {{VirtualHost}} IO pool and re-registered with the port IO pool as part of the invocation of {{NetworkConnectionScheduler#processConnection}}. At first, it was stored in the collection {{SelectorThread.SelectionTask#_unregisteredConnections}}. Later, it was moved from {{SelectorThread.SelectionTask#_unregisteredConnections}} to {{SelectorThread.SelectionTask#_unscheduledConnections}} as part of the invocation of {{SelectorThread.SelectionTask#reregisterUnregisteredConnections}} and got stuck there.

TLS transport was used by the leaked connections, but I think that connections using plain transport can be leaked as well.

I suspect that the connections were leaked as a result of the following scenario:
* Invocation of {{SocketChannel#read(java.nio.ByteBuffer[])}} returned {{-1}} in {{NonBlockingConnection#readFromNetwork}}.
* The flag {{NonBlockingConnection#_closed}} was set to {{true}}. The method {{ProtocolEngine#notifyWork()}} was not invoked to set the {{state changed}} flag to {{true}}.
* The execution of {{NonBlockingConnection#doWork()}} ended up in connection shutdown (due to {{NonBlockingConnection#_closed}} being set), followed by re-scheduling of the connection on the port IO scheduler. The latter resulted in the connection being put into {{SelectorThread.SelectionTask#_unscheduledConnections}} as described above.

It seems that frequently opening and closing connections with a connection life span of {{>10s}} (required for tickers to be removed) can end up in connections being leaked as described in the scenario above.

It looks like connections which are closed orderly, or closed as a result of an {{IOException}} being thrown from a socket read/write operation, are not affected by the defect.

The impacted broker instance can eventually crash with an out of memory error. Broker memory monitoring and periodic broker restarts can mitigate the issue.
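
To illustrate the suspected scenario, here is a minimal Java sketch of the mechanism. It is a simplified model and not the Broker-J code: the class {{UnscheduledLeakSketch}}, the types {{Conn}} and {{Selector}}, and all method names are hypothetical, and only the flags mentioned above (closed, state changed, ticker) are modelled. It assumes that a sweep removes a connection from the unscheduled list only when its ticker is due or its state changed flag is set.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Simplified, hypothetical model of the suspected leak; not Broker-J code. */
class UnscheduledLeakSketch {

    /** Stand-in for a connection; only the flags from the report are modelled. */
    static final class Conn {
        boolean closed;        // models NonBlockingConnection#_closed
        boolean stateChanged;  // models the "state changed" flag
        boolean tickerDue;     // models a due ticker; stays false when no ticker is set

        /** Models the end-of-stream read path: closed is set, no work notification. */
        void onEndOfStream() {
            closed = true;
        }

        /** Models a work notification that sets the state changed flag. */
        void notifyWork() {
            stateChanged = true;
        }
    }

    /** Stand-in for the selection task holding the unscheduled connections. */
    static final class Selector {
        final List<Conn> unscheduled = new ArrayList<>();

        /** One sweep: hand off connections with a due ticker or a state change. */
        List<Conn> processUnscheduled() {
            List<Conn> toSchedule = new ArrayList<>();
            Iterator<Conn> it = unscheduled.iterator();
            while (it.hasNext()) {
                Conn c = it.next();
                if (c.tickerDue || c.stateChanged) {
                    toSchedule.add(c);
                    it.remove();
                }
                // Assumption: nothing here looks at c.closed, so a closed
                // connection with neither flag set stays in the list.
            }
            return toSchedule;
        }
    }

    /** Runs the scenario and returns how many connections remain after the sweeps. */
    static int retainedAfterSweeps(boolean notifyOnClose, int sweeps) {
        Selector selector = new Selector();
        Conn c = new Conn();
        c.onEndOfStream();
        if (notifyOnClose) {
            c.notifyWork();
        }
        selector.unscheduled.add(c);
        for (int i = 0; i < sweeps; i++) {
            selector.processUnscheduled();
        }
        return selector.unscheduled.size();
    }

    public static void main(String[] args) {
        System.out.println("retained without notification: " + retainedAfterSweeps(false, 5));
        System.out.println("retained with notification:    " + retainedAfterSweeps(true, 5));
    }
}
```

In this model the connection closed without a work notification is still in the list after every sweep, while the one whose state changed flag was set is handed off on the first sweep. The sketch only contrasts the two cases; it is not meant to suggest how the defect should be fixed in the broker.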