[ https://issues.apache.org/jira/browse/PROTON-1000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Pavel Moravec reopened PROTON-1000:
-----------------------------------

Reopening both PROTON-1000 and PROTON-1003: at least backport to 0.9 does not fix it.

Reproducer:

{code}
#!/usr/bin/python

from time import sleep
from uuid import uuid4

from proton import ConnectionException, Timeout
from proton import SSLDomain, SSLException
#from proton import Message

from proton.utils import BlockingConnection

import random
import threading

ROUTER_ADDRESS = "amqps://dispatch-router:5671"
ADDRESS = "some_destination"
HEARTBEAT = 2
TIMEOUT = 3

class ReceiverThread(threading.Thread):
    def __init__(self,domain=None):
        super(ReceiverThread, self).__init__()
        self.domain=domain
        self.running = True

    def connect(self):
        self.conn = BlockingConnection(ROUTER_ADDRESS, ssl_domain=self.domain, heartbeat=HEARTBEAT)
        self.recv = self.conn.create_receiver(ADDRESS, name=str(uuid4()), dynamic=False, options=None)

    def run(self):
        while self.running:
            self.connect()
            while self.running:
                try:
                    msg = self.recv.receive(TIMEOUT)
                    if (msg):
                        print "message received: %s" % msg
                        self.recv.accept()
                except:
                    print "receiver failed to accept msg, reconnecting.."
                    try:
                        self.conn.close() # underlying TCP connection never gone
                    except:
                        print "receiver thread: failed to close connection"
                        pass
                    self.connect()

    def stop(self):
        self.running = False

ca_certificate='/etc/rhsm/ca/katello-default-ca.pem'
client_certificate='/etc/pki/consumer/bundle.pem'
client_key=None

domain = SSLDomain(SSLDomain.MODE_CLIENT)
domain.set_trusted_ca_db(ca_certificate)
domain.set_credentials(
    client_certificate,
    client_key or client_certificate, None)
domain.set_peer_authentication(SSLDomain.VERIFY_PEER)

rcv_thread = ReceiverThread(domain)
rcv_thread.start()

_in = raw_input("Press Enter to exit:")

rcv_thread.stop()
rcv_thread.join()
{code}

With SSL enabled (like above), there is an ESTABLISHED connection leak - one per `receiver failed to accept msg, reconnecting` log - `self.conn.close()` has apparently no impact.
With SSL disabled (just set `ssl_domain=None`), there is a CLOSE_WAIT connection leak - again once per `receiver failed to accept msg, reconnecting` log.

> Connection leak on heartbeat-timeouted connections
> --------------------------------------------------
>
> Key: PROTON-1000
> URL: https://issues.apache.org/jira/browse/PROTON-1000
> Project: Qpid Proton
> Issue Type: Bug
> Components: python-binding
> Affects Versions: 0.9
> Reporter: Pavel Moravec
> Assignee: Gordon Sim
> Fix For: 0.11
>
>
> Using gofer/katello-agent that uses BlockingConnection from Proton Reactor
> with heartbeats set up, if some connection timeouts due to the heartbeats,
> Proton does not close the TCP connection. That causes TCP connection leak,
> despite gofer properly called BlockingConnection.close() and forgot any
> reference to that class instance.
>
> Checking tcpdump, Proton simply ignores the timeouted connections - it does
> not respond anyhow to the communication partner whatever it sends (in some
> scenarios it sends some AMQP performative that Proton was assumed to respond,
> in other scenario the communication peer dropped the TCP connection by
> sending FIN+ACK packet but Proton didn't send FIN packet back - the only
> stuff seen in tcpdump is ACKing on TCP layer made by OS, not by Proton).
>
> And Proton ignores an attempt of Proton reactor to close the
> connection/container, raising:
>
> Sep 21 15:02:35 my-capsule goferd: File "/usr/lib64/python2.7/site-packages/proton/utils.py", line 263, in on_transport_closed
> Sep 21 15:02:35 my-capsule goferd: raise ConnectionException("Connection %s disconnected" % self.url);
> Sep 21 15:02:35 my-capsule goferd: ConnectionException: Connection amqps://satellite.example.com:5647 disconnected
>
> for SSL connections, and raising:
>
> Sep 21 14:56:28 my-capsule goferd: File "/usr/lib64/python2.7/site-packages/proton/utils.py", line 259, in on_transport_tail_closed
> Sep 21 14:56:28 my-capsule goferd: self.on_transport_closed(event)
> Sep 21 14:56:28 my-capsule goferd: File "/usr/lib64/python2.7/site-packages/proton/utils.py", line 263, in on_transport_closed
> Sep 21 14:56:28 my-capsule goferd: raise ConnectionException("Connection %s disconnected" % self.url);
> Sep 21 14:56:28 my-capsule goferd: ConnectionException: Connection amqps://satellite.example.com:5647 disconnected
>
> (some difference between SSL and nonSSL could come from the fact that in my
> case the server part - qdrouterd / Qpid Dispatch Router - sends FIN+ACK
> packet for nonSSL connection, while it does not send anything for SSL
> connection and continue for sending empty AMQP frames due to heartbeats
> enabled forever)

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)