HttpTunnelServlet leaks BlockingQueueTransport objects, causing eventual OOM on 
heap space
------------------------------------------------------------------------------------------

                 Key: AMQ-3021
                 URL: https://issues.apache.org/activemq/browse/AMQ-3021
             Project: ActiveMQ
          Issue Type: Bug
          Components: Broker, Transport
    Affects Versions: 5.4.1
            Reporter: Stirling Chow
            Priority: Critical


Symptom
========
We have a production system involving a network of 8 Brokers connected over 
HTTP.  The Brokers discover each other using SimpleDiscoveryAgent.  Our network 
experienced a period of instability during which time numerous Broker-to-Broker 
bridges were created and failed repeatedly.  Over the course of about 7 hours, 
two of the Brokers crashed with OOM heap space errors.

We analyzed the heap dump and discovered several thousand instances of 
org.apache.activemq.transport.http.BlockingQueueTransport.  These transports 
were associated with bridges that had failed, however, they were not being 
garbage collected because HttpTunnelServlet was maintaining references to them.

This issue was easily replicated in a test environment were we repeatedly broke 
the connection between a pair of Brokers connected over HTTP.  In each case, 
both Brokers maintained *indefinitely* a number of instances of 
BlockingQueueTransport equal to the number of times the network was interrupted.

Cause
=====
When a bridge is first created over HTTP, the client broker's 
HttpClientTransport sends a HEAD command to the server broker, which is 
processed by an instance of HttpTunnelServlet.  In response,e HttpTunnelServlet 
creates an instance of BlockingQueueTransport to represent the connection to 
the client broker.  This instance of BlockingQueueTransport is stored in a 
private hash map managed by HttpTunnelServlet and indexed by the client's 
unique ID:

public class HttpTunnelServlet extends HttpServlet {
...
    private final Map<String, BlockingQueueTransport> clients = new 
HashMap<String, BlockingQueueTransport>();
...

    protected BlockingQueueTransport createTransportChannel(HttpServletRequest 
request, HttpServletResponse response) throws IOException {
        String clientID = request.getHeader("clientID");
...
            answer = createTransportChannel();
            clients.put(clientID, answer);
...

Every time a client broker reestablishes a bridge, it generates a new clientID. 
 As a result, the clients hash map accumulates instances of 
BlockingQueueTransport, one for each bridge created.  Nowhere in the 
implementation of HttpTunnelServlet is there any code that removes the instance 
when a client broker is no longer connected.  In an environment with multiple 
brokers and an unreliable network, the client hash  map can accumulate 
thousands of instances of BlockingQueueTransport.

Solution
=======
HttpTunnelServlet needs to remove an instance of BlockingQueueTransport from 
the clients hash map whenever that instance is no longer being used.  The 
addition of InactivityMonitor as a default interceptor for the 
BlockingQueueTransport (see AMQ-2764) is a partial solution in that it triggers 
the closure of unused BlockingQueueTransport instances; however, 
HttpTunnelServlet does not detect these closures.

The solution is included a patch and involves the following changes to 
HttpTunnelServlet (not all changes are directly related to the OOM):
1) The addition of a ServiceListener to the BlockingQueueTransport, which is 
triggered when the transport is closed and causes the removal of the transport 
from the clients hash map
2) Refactoring of the access to the clients hash map to simplify thread safety 
(in particularly, removal of explicit synchronization in lieue of 
ConcurrentHashMap)
3) An additional check on the BlockingQueueTransport to ensure that it was not 
prematurely closed (the previous code ignored this possibility)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply via email to