Alex Rudyy reassigned QPID-8276:

    Assignee: Alex Rudyy

> [Broker-J] Broker can leak closed NonBlockingConnection objects and 
> eventually run out of heap memory
> -----------------------------------------------------------------------------------------------------
>                 Key: QPID-8276
>                 URL: https://issues.apache.org/jira/browse/QPID-8276
>             Project: Qpid
>          Issue Type: Bug
>          Components: Broker-J
>    Affects Versions: qpid-java-broker-7.0.3, qpid-java-broker-7.0.2, 
> qpid-java-broker-7.0.0, qpid-java-broker-7.0.1, qpid-java-6.1.7, 
> qpid-java-broker-7.1.0, qpid-java-broker-7.0.4, qpid-java-broker-7.0.5, 
> qpid-java-broker-7.0.6
>            Reporter: Alex Rudyy
>            Assignee: Alex Rudyy
>            Priority: Critical
>             Fix For: qpid-java-broker-7.0.7, qpid-java-broker-7.1.1
> The Qpid Broker-J can leak closed NonBlockingConnection objects.
> Heap dump analysis of an impacted broker instance revealed that the leaked 
> {{NonBlockingConnection}} objects accumulate in 
> {{SelectorThread.SelectionTask#_unscheduledConnections}} belonging to the 
> AMQP port IO pool. They have no ticker set and no state changed flag set 
> ({{NonBlockingConnection#isStateChanged() == false}}). As a result, the 
> {{NonBlockingConnection}} objects are not removed from 
> {{SelectorThread.SelectionTask#_unscheduledConnections}} on invocation of 
> {{SelectorThread.SelectionTask#processUnscheduledConnections()}}, which is 
> called from {{SelectorThread.SelectionTask#performSelect()}}.
> The {{NonBlockingConnection}} and the underlying model object are in the 
> closed state.
>  It seems that the leaked {{NonBlockingConnection}} was closed as part of 
> the invocation of {{NonBlockingConnection#doWork()}}. The connection was 
> unregistered from the {{VirtualHost}} IO pool and re-registered with the port 
> IO pool as part of the invocation of 
> {{NetworkConnectionScheduler#processConnection}}. At first, it was stored in 
> the collection {{SelectorThread.SelectionTask#_unregisteredConnections}}. 
> Later on, it was moved from 
> {{SelectorThread.SelectionTask#_unregisteredConnections}} to 
> {{SelectorThread.SelectionTask#_unscheduledConnections}} as part of the 
> invocation of 
> {{SelectorThread.SelectionTask#reregisterUnregisteredConnections}}, and it 
> remained stuck there afterwards.
> TLS transport was used by the leaked connection, but I think that a 
> connection with plain transport can be leaked as well.
> I suspect that connections were leaked as a result of the following scenario:
>  * Invocation of {{SocketChannel#read(java.nio.ByteBuffer[])}} returned 
> {{-1}} in {{NonBlockingConnection#readFromNetwork}}.
>  * The flag {{NonBlockingConnection#_closed}} was set to {{true}}. The method 
> {{ProtocolEngine#notifyWork()}} was not invoked to set the {{state changed}} 
> flag to {{true}}.
>  * The execution of {{NonBlockingConnection#doWork()}} ended up in connection 
> shutdown (due to {{NonBlockingConnection#_closed}} being set), followed by 
> re-scheduling of the connection on the port IO scheduler. The latter resulted 
> in the connection being put into 
> {{SelectorThread.SelectionTask#_unscheduledConnections}} as described above.
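
The suspected scenario can be illustrated with a small stand-alone model. This is only a sketch and not the Broker-J source: the class {{UnscheduledLeakSketch}}, the {{Conn}} stand-in and the {{processUnscheduled}} method are hypothetical names, and the removal rule (an entry leaves the unscheduled list only when its state changed flag is set or a ticker is due) is an assumption taken from the description above.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Simplified, hypothetical model of the suspected leak; NOT the actual Broker-J code.
public class UnscheduledLeakSketch {

    // Stand-in for NonBlockingConnection, reduced to the fields relevant here.
    static final class Conn {
        final String name;
        final boolean closed;        // mirrors the _closed flag
        final boolean stateChanged;  // what isStateChanged() would report
        final boolean hasTicker;     // whether a ticker is still set

        Conn(String name, boolean closed, boolean stateChanged, boolean hasTicker) {
            this.name = name;
            this.closed = closed;
            this.stateChanged = stateChanged;
            this.hasTicker = hasTicker;
        }
    }

    // Assumed shape of the pass over unscheduled connections: an entry is
    // removed (and handed back for scheduling) only if its state changed or
    // one of its tickers is due. The closed flag is never consulted.
    static List<Conn> processUnscheduled(List<Conn> unscheduled, boolean tickerDue) {
        List<Conn> scheduled = new ArrayList<>();
        for (Iterator<Conn> it = unscheduled.iterator(); it.hasNext(); ) {
            Conn c = it.next();
            if (c.stateChanged || (c.hasTicker && tickerDue)) {
                it.remove();
                scheduled.add(c);
            }
        }
        return scheduled;
    }

    public static void main(String[] args) {
        List<Conn> unscheduled = new ArrayList<>();
        unscheduled.add(new Conn("active", false, true, true));
        // Closed after read() returned -1, no notifyWork(), tickers already removed.
        unscheduled.add(new Conn("leaked", true, false, false));

        // No number of passes removes the closed connection.
        for (int pass = 0; pass < 3; pass++) {
            processUnscheduled(unscheduled, true);
        }
        System.out.println("still unscheduled: " + unscheduled.size()); // prints 1
    }
}
```

In this model the closed connection is retained because neither condition in the removal rule can ever become true for it, which matches the state observed in the heap dump.
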
> It seems that frequently opening and closing connections with a connection 
> life span of {{>10s}} (required for tickers to be removed) can end up in 
> connections being leaked as described in the scenario above. It looks like 
> connections which are closed in an orderly manner, or closed as a result of 
> an {{IOException}} being thrown from a socket read/write operation, are not 
> affected by the defect.
> The impacted broker instance can eventually crash with an out of memory error. 
> Broker memory monitoring and periodic broker restarts can mitigate the issue.
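
As an illustration of the memory monitoring mitigation, the standard JVM management API can report heap usage. The following is a minimal sketch only: the class name {{HeapWatchSketch}}, its helper methods and the 0.85 threshold are assumptions made for this example and are not part of Broker-J. As written, it inspects the JVM it runs in, so observing a broker would require running equivalent logic in the broker process or reading the same values over a remote JMX connection.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical heap usage check; names and threshold are illustrative assumptions.
public class HeapWatchSketch {

    // Fraction of the maximum heap in use, or -1.0 when the maximum is undefined
    // (MemoryUsage#getMax() returns -1 in that case).
    static double usedFraction(long used, long max) {
        return max > 0 ? (double) used / max : -1.0;
    }

    // True when the used fraction is known and has reached the threshold.
    static boolean aboveThreshold(long used, long max, double threshold) {
        double fraction = usedFraction(used, max);
        return fraction >= 0 && fraction >= threshold;
    }

    public static void main(String[] args) {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = memory.getHeapMemoryUsage();

        double threshold = 0.85; // arbitrary value chosen for this sketch
        if (aboveThreshold(heap.getUsed(), heap.getMax(), threshold)) {
            System.out.println("heap usage above threshold, consider a restart");
        } else {
            System.out.println("heap usage below threshold or maximum undefined");
        }
    }
}
```

A steadily rising used fraction after full garbage collections, rather than a single high reading, would be the stronger indication of the leak described above.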

