[
https://issues.apache.org/jira/browse/QPID-8276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Alex Rudyy reassigned QPID-8276:
--------------------------------
Assignee: Alex Rudyy
> [Broker-J] Broker can leak closed NonBlockingConnection objects and
> eventually run out of heap memory
> -----------------------------------------------------------------------------------------------------
>
> Key: QPID-8276
> URL: https://issues.apache.org/jira/browse/QPID-8276
> Project: Qpid
> Issue Type: Bug
> Components: Broker-J
> Affects Versions: qpid-java-broker-7.0.3, qpid-java-broker-7.0.2,
> qpid-java-broker-7.0.0, qpid-java-broker-7.0.1, qpid-java-6.1.7,
> qpid-java-broker-7.1.0, qpid-java-broker-7.0.4, qpid-java-broker-7.0.5,
> qpid-java-broker-7.0.6
> Reporter: Alex Rudyy
> Assignee: Alex Rudyy
> Priority: Critical
> Fix For: qpid-java-broker-7.0.7, qpid-java-broker-7.1.1
>
>
> The Qpid Broker-J can leak closed {{NonBlockingConnection}} objects.
> Heap dump analysis of an impacted broker instance revealed that the leaked
> {{NonBlockingConnection}} objects accumulate in
> {{SelectorThread.SelectionTask#_unscheduledConnections}} belonging to the
> AMQP port IO pool. They have no ticker set and the state changed flag is not
> set ({{NonBlockingConnection#isStateChanged() == false}}). As a result, the
> {{NonBlockingConnection}} objects are not removed from
> {{SelectorThread.SelectionTask#_unscheduledConnections}} on invocation of
> {{SelectorThread.SelectionTask#processUnscheduledConnections()}} called from
> {{SelectorThread.SelectionTask#performSelect()}}.
> The {{NonBlockingConnection}} and the underlying model object are in the
> closed state.
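The retention condition described above can be illustrated with a small model. This is a minimal sketch, not the Broker-J source: the class `UnscheduledSweepSketch`, the type `Conn`, and its two boolean fields are hypothetical stand-ins for the state changed flag and a due ticker. It assumes only what the report states, namely that an entry leaves the unscheduled collection when its state has changed or a ticker fires.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Simplified model of the sweep described in the report; all names are hypothetical. */
final class UnscheduledSweepSketch {

    /** Stand-in for a connection parked in the unscheduled collection. */
    static final class Conn {
        final boolean stateChanged; // models isStateChanged()
        final boolean tickerDue;    // models "a ticker is set and has expired"

        Conn(boolean stateChanged, boolean tickerDue) {
            this.stateChanged = stateChanged;
            this.tickerDue = tickerDue;
        }
    }

    /** One sweep: an entry leaves only if its state changed or a ticker is due. */
    static void sweep(List<Conn> unscheduled) {
        Iterator<Conn> it = unscheduled.iterator();
        while (it.hasNext()) {
            Conn c = it.next();
            if (c.stateChanged || c.tickerDue) {
                it.remove(); // in a real scheduler it would be handed back for processing
            }
        }
    }

    /** Parks one connection, runs the given number of sweeps, returns how many remain. */
    static int remainingAfterSweeps(boolean stateChanged, boolean tickerDue, int sweeps) {
        List<Conn> unscheduled = new ArrayList<>();
        unscheduled.add(new Conn(stateChanged, tickerDue));
        for (int i = 0; i < sweeps; i++) {
            sweep(unscheduled);
        }
        return unscheduled.size();
    }

    public static void main(String[] args) {
        System.out.println("no flag, no ticker: " + remainingAfterSweeps(false, false, 1000));
        System.out.println("state changed:      " + remainingAfterSweeps(true, false, 1));
    }
}
```

In this model a connection with neither condition set stays in the list no matter how many sweeps run, which matches the accumulation seen in the heap dump.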
> It seems that the leaked {{NonBlockingConnection}} was closed as part of
> the invocation of {{NonBlockingConnection#doWork()}}. The connection was
> unregistered from the {{VirtualHost}} IO pool and re-registered with the
> port IO pool as part of the invocation of
> {{NetworkConnectionScheduler#processConnection}}. At first, it was stored in
> the collection {{SelectorThread.SelectionTask#_unregisteredConnections}}.
> Later on, it was moved from
> {{SelectorThread.SelectionTask#_unregisteredConnections}} to
> {{SelectorThread.SelectionTask#_unscheduledConnections}} as part of the
> invocation of
> {{SelectorThread.SelectionTask#reregisterUnregisteredConnections}}
> and got stuck there afterwards.
> TLS transport was used by the leaked connection, but I think that
> connections using plain transport can be leaked as well.
> I suspect that the connections were leaked as a result of the following
> scenario:
> * Invocation of {{SocketChannel#read(java.nio.ByteBuffer[])}} returned
> {{-1}} in {{NonBlockingConnection#readFromNetwork}}.
> * The flag {{NonBlockingConnection#_closed}} was set to {{true}}. The method
> {{ProtocolEngine#notifyWork()}} was not invoked to set the {{state changed}}
> flag to {{true}}.
> * The execution of {{NonBlockingConnection#doWork()}} ended up in connection
> shutdown (due to {{NonBlockingConnection#_closed}} being set), followed by
> re-scheduling of the connection on the port IO scheduler. The latter
> resulted in the connection being put into
> {{SelectorThread.SelectionTask#_unscheduledConnections}} as described above.
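The first two steps of the suspected scenario can be sketched as follows. This is a hypothetical model rather than the Broker-J implementation: `EndOfStreamSketch`, its fields, and the `notifyOnEof` switch are assumptions made for illustration, and a `java.nio.channels.Pipe` stands in for the socket so that the example runs without a network. The switch exists only to contrast the two outcomes; it is not a claim about how the defect was fixed.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.ScatteringByteChannel;

/** Hypothetical model of the suspected scenario; not the Broker-J implementation. */
final class EndOfStreamSketch {
    boolean closed;       // models NonBlockingConnection#_closed
    boolean stateChanged; // models the "state changed" flag

    /** Models notifyWork(): marks the connection as needing attention. */
    void notifyWork() {
        stateChanged = true;
    }

    /** Reads once; on end-of-stream (-1) marks the connection closed. */
    long readFromNetwork(ScatteringByteChannel channel, boolean notifyOnEof) throws IOException {
        long read = channel.read(new ByteBuffer[] { ByteBuffer.allocate(64) });
        if (read == -1) {
            closed = true;
            if (notifyOnEof) {
                notifyWork(); // the step the report suspects was skipped
            }
        }
        return read;
    }

    /** Simulates a peer that closes without sending data, then reads on our side. */
    static EndOfStreamSketch simulatePeerClose(boolean notifyOnEof) {
        EndOfStreamSketch conn = new EndOfStreamSketch();
        try {
            Pipe pipe = Pipe.open(); // stands in for a connected socket pair
            pipe.sink().close();     // the "peer" closes its end
            try (Pipe.SourceChannel source = pipe.source()) {
                conn.readFromNetwork(source, notifyOnEof);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return conn;
    }

    public static void main(String[] args) {
        EndOfStreamSketch leaked = simulatePeerClose(false);
        System.out.println("closed=" + leaked.closed + " stateChanged=" + leaked.stateChanged);
    }
}
```

With `notifyOnEof` set to `false` the model ends up closed but with the state changed flag still unset, which is the combination observed on the leaked objects.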
> It seems that frequently opening and closing connections with a connection
> life span of {{>10s}} (required for tickers to be removed) can end up in
> connections being leaked as described in the scenario above. It looks like
> connections which are closed in an orderly manner or closed as a result of an
> {{IOException}} being thrown from a socket read/write operation are not
> affected by the defect.
> The impacted broker instance can eventually crash with an out-of-memory
> error. Broker memory monitoring and periodic broker restarts can mitigate
> the issue.
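As one possible shape for the memory monitoring mentioned above, the following minimal sketch reads heap usage through the standard `java.lang.management` API. It checks the JVM it runs in; a broker would more likely be observed from outside, for example over JMX. The class name `HeapWatchSketch` and the 0.85 threshold are illustrative assumptions, not values taken from the issue.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

/** Minimal in-process heap check; names and threshold are illustrative assumptions. */
final class HeapWatchSketch {

    /** Fraction of heap in use, measured against max (or committed if max is undefined). */
    static double heapUsedFraction() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        // getMax() returns -1 when the maximum is undefined, so fall back to committed.
        long limit = heap.getMax() > 0 ? heap.getMax() : heap.getCommitted();
        return (double) heap.getUsed() / limit;
    }

    /** True when the observed fraction has reached the alert threshold. */
    static boolean exceeds(double fraction, double threshold) {
        return fraction >= threshold;
    }

    public static void main(String[] args) {
        double threshold = 0.85; // arbitrary example value
        double used = heapUsedFraction();
        System.out.printf("heap used: %.1f%%, alert: %b%n", used * 100, exceeds(used, threshold));
    }
}
```

A steadily rising fraction that does not drop after garbage collection would be the signal to schedule a restart before the broker runs out of heap.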
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)