Alex Rudyy created QPID-8276:
--------------------------------

             Summary: [Broker-J] Broker can leak closed NonBlockingConnection 
objects and eventually run out of heap memory
                 Key: QPID-8276
                 URL: https://issues.apache.org/jira/browse/QPID-8276
             Project: Qpid
          Issue Type: Improvement
          Components: Broker-J
    Affects Versions: qpid-java-broker-7.0.6, qpid-java-broker-7.0.5, 
qpid-java-broker-7.0.4, qpid-java-broker-7.1.0, qpid-java-6.1.7, 
qpid-java-broker-7.0.1, qpid-java-broker-7.0.0, qpid-java-broker-7.0.2, 
qpid-java-broker-7.0.3
            Reporter: Alex Rudyy
             Fix For: qpid-java-broker-7.0.7, qpid-java-broker-7.1.1


The Qpid Broker-J can leak closed NonBlockingConnection objects.

The heap dump analysis of an impacted broker instance revealed that the leaked 
{{NonBlockingConnection}} objects accumulate in 
{{SelectorThread.SelectionTask#_unscheduledConnections}} belonging to the AMQP 
port IO pool. They have no ticker set and no state changed flag set 
({{NonBlockingConnection#isStateChanged() == false}}). As a result, the 
{{NonBlockingConnection}} objects are not removed from 
{{SelectorThread.SelectionTask#_unscheduledConnections}} on invocation of 
{{SelectorThread.SelectionTask#processUnscheduledConnections()}}, which is 
called from {{SelectorThread.SelectionTask#performSelect()}}.
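
To make the removal condition concrete, here is a minimal sketch of such a sweep. It is a simplified model, not the Broker-J source: the classes {{SketchConnection}} and {{UnscheduledSweepSketch}} are hypothetical, and the rule "remove only when the state changed flag or a ticker is present" is an assumption taken from the description above.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a connection; NOT the Broker-J NonBlockingConnection.
final class SketchConnection {
    final String name;
    final boolean closed;
    final boolean stateChanged;
    final boolean hasTicker;

    SketchConnection(String name, boolean closed, boolean stateChanged, boolean hasTicker) {
        this.name = name;
        this.closed = closed;
        this.stateChanged = stateChanged;
        this.hasTicker = hasTicker;
    }
}

final class UnscheduledSweepSketch {
    // Assumed rule: a connection leaves the unscheduled list only when its
    // state changed flag is set or it has a ticker. The closed flag is
    // deliberately ignored, which is what lets closed connections linger.
    static List<SketchConnection> sweep(List<SketchConnection> unscheduled) {
        List<SketchConnection> scheduled = new ArrayList<>();
        Iterator<SketchConnection> it = unscheduled.iterator();
        while (it.hasNext()) {
            SketchConnection c = it.next();
            if (c.stateChanged || c.hasTicker) {
                it.remove();
                scheduled.add(c);
            }
        }
        return scheduled;
    }

    public static void main(String[] args) {
        List<SketchConnection> unscheduled = new ArrayList<>();
        unscheduled.add(new SketchConnection("active", false, true, false));
        unscheduled.add(new SketchConnection("leaked", true, false, false));
        // Repeated sweeps never pick up the closed connection.
        for (int pass = 0; pass < 3; pass++) {
            sweep(unscheduled);
        }
        System.out.println(unscheduled.size() + " connection(s) still unscheduled");
    }
}
```

In this model the closed connection with neither flag nor ticker stays in the list however many times the sweep runs, which matches what the heap dump showed.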

The {{NonBlockingConnection}} and the underlying model object are in the closed 
state. It seems that the leaked {{NonBlockingConnection}} was closed as part of 
the invocation of {{NonBlockingConnection#doWork()}}. The connection was 
unregistered from the {{VirtualHost}} IO pool and re-registered with the port 
IO pool as part of the invocation of 
{{NetworkConnectionScheduler#processConnection}}. At first, it was stored in 
the collection {{SelectorThread.SelectionTask#_unregisteredConnections}}. 
Later on, it was moved from 
{{SelectorThread.SelectionTask#_unregisteredConnections}} to 
{{SelectorThread.SelectionTask#_unscheduledConnections}} as part of the 
invocation of 
{{SelectorThread.SelectionTask#reregisterUnregisteredConnections}} and got 
stuck there afterwards.

The leaked connection used the TLS transport, but I think that a connection 
using the plain transport can be leaked as well.

I suspect that the connections were leaked as a result of the following 
scenario:
 * The invocation of {{SocketChannel#read(java.nio.ByteBuffer[])}} returned 
{{-1}} in {{NonBlockingConnection#readFromNetwork}}.
 * The flag {{NonBlockingConnection#_closed}} was set to {{true}}. The method 
{{ProtocolEngine#notifyWork()}} was not invoked to set the {{state changed}} 
flag to {{true}}.
 * The execution of {{NonBlockingConnection#doWork()}} ended up in connection 
shutdown (due to {{NonBlockingConnection#_closed}} being set), followed by 
re-scheduling of the connection on the port IO scheduler. The latter resulted 
in the connection being put into 
{{SelectorThread.SelectionTask#_unscheduledConnections}} as described above.
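
The suspected sequence can be modelled in a few lines. This is only a sketch under the assumptions listed in the scenario: {{ReadPathSketch}} and its {{notifyOnEndOfStream}} parameter are hypothetical names introduced for illustration, and the second call in {{main}} is there for contrast only; it does not describe how the defect is or should be fixed in Broker-J.

```java
// Hypothetical model of the suspected sequence; not Broker-J source code.
final class ReadPathSketch {
    private boolean closed;
    private boolean stateChanged;

    // Simulates handling the result of a channel read. A result of -1 means
    // end of stream, as documented for SocketChannel#read.
    void onReadResult(int bytesRead, boolean notifyOnEndOfStream) {
        if (bytesRead == -1) {
            closed = true;
            if (notifyOnEndOfStream) {
                // Stand-in for the notifyWork() call that the scenario
                // assumes was skipped.
                stateChanged = true;
            }
        }
    }

    // Assumption: with no ticker set, a closed connection whose state changed
    // flag is false is never picked up from the unscheduled collection.
    boolean wouldStayUnscheduled() {
        return closed && !stateChanged;
    }

    public static void main(String[] args) {
        ReadPathSketch suspected = new ReadPathSketch();
        suspected.onReadResult(-1, false);
        System.out.println("no notify:   stuck = " + suspected.wouldStayUnscheduled());

        ReadPathSketch contrast = new ReadPathSketch();
        contrast.onReadResult(-1, true);
        System.out.println("with notify: stuck = " + contrast.wouldStayUnscheduled());
    }
}
```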

It seems that frequently opening and closing connections with a connection 
life span of {{>10s}} (required for the tickers to be removed) can end up in 
connections being leaked as described in the scenario above. It looks like 
connections which are closed orderly, or closed as a result of an 
{{IOException}} being thrown from a socket read/write operation, are not 
affected by the defect.

The impacted broker instance can eventually crash with an out-of-memory error. 
Broker memory monitoring and periodic broker restarts can mitigate the issue.
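
As one possible shape for such monitoring, the sketch below reads heap usage through the standard {{java.lang.management}} API. The class {{HeapWatchSketch}} and the 90% threshold are assumptions made for illustration. The code inspects the JVM it runs in, so it would have to run inside the broker JVM or be adapted to read the same MXBean over a remote JMX connection.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical helper for illustration; not part of Broker-J.
final class HeapWatchSketch {
    // Returns used/max as a fraction, or -1 when the maximum is undefined
    // (MemoryUsage#getMax() returns -1 in that case).
    static double heapUsedFraction(MemoryUsage usage) {
        long max = usage.getMax();
        if (max <= 0) {
            return -1.0;
        }
        return (double) usage.getUsed() / max;
    }

    // True when the fraction is known and at or above the given threshold.
    static boolean aboveThreshold(MemoryUsage usage, double threshold) {
        double fraction = heapUsedFraction(usage);
        return fraction >= 0 && fraction >= threshold;
    }

    public static void main(String[] args) {
        MemoryMXBean bean = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = bean.getHeapMemoryUsage();
        System.out.printf("heap used fraction: %.2f%n", heapUsedFraction(heap));
        // The 0.9 threshold is an arbitrary example value.
        if (aboveThreshold(heap, 0.9)) {
            System.out.println("heap above 90%, consider scheduling a restart");
        }
    }
}
```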



