[ 
https://issues.apache.org/jira/browse/PROTON-1000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavel Moravec reopened PROTON-1000:
-----------------------------------

Reopening both PROTON-1000 and PROTON-1003: the backport to 0.9, at least, 
does not fix the leak. Reproducer:

{code}
#!/usr/bin/python

from time import sleep
from uuid import uuid4

from proton import ConnectionException, Timeout
from proton import SSLDomain, SSLException
#from proton import Message

from proton.utils import BlockingConnection

import random
import threading

ROUTER_ADDRESS = "amqps://dispatch-router:5671"
ADDRESS = "some_destination"
HEARTBEAT = 2
TIMEOUT = 3

class ReceiverThread(threading.Thread):
    def __init__(self,domain=None):
        super(ReceiverThread, self).__init__()
        self.domain=domain
        self.running = True

    def connect(self):
        self.conn = BlockingConnection(ROUTER_ADDRESS, ssl_domain=self.domain,
                                       heartbeat=HEARTBEAT)
        self.recv = self.conn.create_receiver(ADDRESS, name=str(uuid4()),
                                              dynamic=False, options=None)

    def run(self):
        while self.running:
            self.connect()
            while self.running:
                try:
                    # receive() raises (e.g. Timeout or ConnectionException)
                    # when no message arrives or the connection is dropped
                    msg = self.recv.receive(TIMEOUT)
                    if msg:
                        print "message received: %s" % msg
                        self.recv.accept()
                except Exception:
                    print "receiver failed to accept msg, reconnecting.."
                    try:
                        # expected to release the socket, but the underlying
                        # TCP connection is never closed
                        self.conn.close()
                    except Exception:
                        print "receiver thread: failed to close connection"
                    self.connect()

    def stop(self):
        self.running = False

ca_certificate='/etc/rhsm/ca/katello-default-ca.pem'
client_certificate='/etc/pki/consumer/bundle.pem'
client_key=None

domain = SSLDomain(SSLDomain.MODE_CLIENT)
domain.set_trusted_ca_db(ca_certificate)
domain.set_credentials(
  client_certificate,
  client_key or client_certificate, None)
domain.set_peer_authentication(SSLDomain.VERIFY_PEER)

rcv_thread = ReceiverThread(domain)
rcv_thread.start()
_in = raw_input("Press Enter to exit:")
rcv_thread.stop()
rcv_thread.join()
{code}

With SSL enabled (as above), there is a leak of ESTABLISHED connections, one 
per `receiver failed to accept msg, reconnecting..` log line; 
`self.conn.close()` apparently has no effect.

With SSL disabled (just set `ssl_domain=None`), there is a leak of CLOSE_WAIT 
connections, again one per `receiver failed to accept msg, reconnecting..` 
log line.
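
To compare the number of leaked connections against the number of reconnect log lines, the socket states can be tallied from `netstat -tan` output. The following is only a rough sketch, not part of the reproducer: `count_tcp_states` is a hypothetical helper, it assumes the usual Linux `netstat` column layout, and the sample text (addresses, and port 5672 for the non-SSL case) is made up for illustration. Port 5671 is the one used in `ROUTER_ADDRESS` above.

```python
from collections import Counter


def count_tcp_states(netstat_output, remote_port):
    """Count TCP states of connections whose remote end uses remote_port.

    Assumes `netstat -tan` style columns:
    Proto Recv-Q Send-Q Local-Address Foreign-Address State
    """
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip headers and anything that is not a TCP socket line.
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        foreign, state = fields[4], fields[5]
        # The port is whatever follows the last ':' in the foreign address.
        if foreign.rsplit(":", 1)[-1] == str(remote_port):
            counts[state] += 1
    return counts


# Hypothetical sample output, for illustration only.
SAMPLE = """\
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 10.0.0.5:41000          10.0.0.9:5671           ESTABLISHED
tcp        0      0 10.0.0.5:41002          10.0.0.9:5671           ESTABLISHED
tcp        1      0 10.0.0.5:41004          10.0.0.9:5672           CLOSE_WAIT
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN
"""

ssl_states = count_tcp_states(SAMPLE, 5671)
plain_states = count_tcp_states(SAMPLE, 5672)
```

In practice the text would come from running `netstat -tan` on the client while the reproducer is running; a count that grows by one per reconnect log line matches the leak described here.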

> Connection leak on heartbeat-timeouted connections
> --------------------------------------------------
>
>                 Key: PROTON-1000
>                 URL: https://issues.apache.org/jira/browse/PROTON-1000
>             Project: Qpid Proton
>          Issue Type: Bug
>          Components: python-binding
>    Affects Versions: 0.9
>            Reporter: Pavel Moravec
>            Assignee: Gordon Sim
>             Fix For: 0.11
>
>
> Using gofer/katello-agent, which uses BlockingConnection from Proton Reactor 
> with heartbeats set up: if a connection times out due to the heartbeats, 
> Proton does not close the TCP connection. That causes a TCP connection leak, 
> even though gofer properly called BlockingConnection.close() and dropped 
> every reference to that class instance.
> According to tcpdump, Proton simply ignores the timed-out connections: it 
> does not respond to the communication partner, whatever the partner sends. 
> (In some scenarios the partner sends an AMQP performative that Proton was 
> expected to answer; in another scenario the partner dropped the TCP 
> connection by sending a FIN+ACK packet but Proton didn't send a FIN packet 
> back. The only traffic seen in tcpdump is ACKs on the TCP layer, sent by the 
> OS, not by Proton.) Proton also ignores an attempt by the Proton reactor to 
> close the connection/container, raising:
> Sep 21 15:02:35 my-capsule goferd: File "/usr/lib64/python2.7/site-packages/proton/utils.py", line 263, in on_transport_closed
> Sep 21 15:02:35 my-capsule goferd: raise ConnectionException("Connection %s disconnected" % self.url);
> Sep 21 15:02:35 my-capsule goferd: ConnectionException: Connection amqps://satellite.example.com:5647 disconnected
> for SSL connections, and raising:
> Sep 21 14:56:28 my-capsule goferd: File "/usr/lib64/python2.7/site-packages/proton/utils.py", line 259, in on_transport_tail_closed
> Sep 21 14:56:28 my-capsule goferd: self.on_transport_closed(event)
> Sep 21 14:56:28 my-capsule goferd: File "/usr/lib64/python2.7/site-packages/proton/utils.py", line 263, in on_transport_closed
> Sep 21 14:56:28 my-capsule goferd: raise ConnectionException("Connection %s disconnected" % self.url);
> Sep 21 14:56:28 my-capsule goferd: ConnectionException: Connection amqps://satellite.example.com:5647 disconnected
> (Some of the difference between SSL and non-SSL could come from the fact 
> that, in my case, the server side (qdrouterd / Qpid Dispatch Router) sends a 
> FIN+ACK packet for a non-SSL connection, while for an SSL connection it 
> sends no such packet and keeps sending empty AMQP frames forever because 
> heartbeats are enabled.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
