Re: [Openstack] Horizon and open connections

Rick Jones Thu, 31 Jan 2013 13:03:08 -0800

On 01/31/2013 11:59 AM, Gabriel Hurley wrote:

Even though I don't experience this problem (and prefer nginx to
apache), I can help diagnose:


Connections ending up in CLOSE_WAIT means that the socket isn't being
fully closed, which is controlled by the client lib (in this case
python-keystoneclient) which uses httplib2 under the hood.

Expanding on that a bit. CLOSE_WAIT is the state a TCP endpoint will enter upon receiving a FINished segment from the remote TCP. When the FIN arrives, the local application will receive notification of this via the classic "read return of zero" on a receive/read call against the socket. The FIN segment "means" "I will be sending you no more data."

Meanwhile, the local TCP will have ACKed the FIN segment, and the remote TCP will transition to FIN_WAIT_2 upon receipt of that ACK (until then it will be in FIN_WAIT_1).

Depending on how the remote application triggered the sending of the FIN, the TCP connection is now in a perfectly valid simplex state wherein the side in CLOSE_WAIT can continue sending data to the side which will now be in FIN_WAIT_2. It is exceedingly rare for applications to want a simplex TCP connection.
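To make the simplex state concrete, here is a minimal Python sketch, assuming a loopback TCP connection is available. The helper names (tcp_pair, half_close_demo) are made up for illustration. The "remote" side sends its FIN with shutdown(SHUT_WR), and the "local" side, now in CLOSE_WAIT, still sends data back before closing.

```python
import socket


def tcp_pair():
    """Return a connected (client, server) pair of TCP sockets on loopback."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    client = socket.create_connection(listener.getsockname())
    server, _ = listener.accept()
    listener.close()
    return client, server


def read_all(sock):
    """Read until recv() returns zero bytes (the peer's FIN)."""
    data = b""
    while True:
        chunk = sock.recv(4096)
        if not chunk:  # read return of zero
            return data
        data += chunk


def half_close_demo():
    remote, local = tcp_pair()

    remote.sendall(b"request")
    remote.shutdown(socket.SHUT_WR)  # sends FIN: "no more data from me"

    received = read_all(local)       # ends when the FIN arrives
    # local is now in CLOSE_WAIT but may still send toward the remote side.
    local.sendall(b"reply")
    local.close()                    # local's own FIN finishes the connection

    reply = read_all(remote)
    remote.close()
    return received, reply
```

Calling half_close_demo() should hand back both payloads, which shows data still flowing from the CLOSE_WAIT side after the other direction has been shut down.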

If such a unidirectional TCP connection is not of any use to an application (the common case), then that application should/must also close the connection upon the read return of zero.

Thus, seeing lots of connections "stuck" in CLOSE_WAIT is an indication of an application-level (relative to TCP) bug wherein the application on the "CLOSE_WAIT side" is ignoring the read return of zero.
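As a rough sketch of the well-behaved pattern (this is not the actual httplib2 or python-keystoneclient code, and read_until_eof is a name invented here), an application that has no use for a simplex connection treats the zero-length read as its cue to close:

```python
import socket


def read_until_eof(sock, bufsize=4096):
    """Read until the peer's FIN arrives, then close our side as well."""
    chunks = []
    try:
        while True:
            chunk = sock.recv(bufsize)
            if not chunk:       # read return of zero: the peer sent its FIN
                break
            chunks.append(chunk)
    finally:
        # Closing sends our own FIN, so the socket leaves CLOSE_WAIT
        # instead of lingering there until the process exits.
        sock.close()
    return b"".join(chunks)
```

The bug described above amounts to skipping that close(): the socket object stays open and referenced somewhere, and the connection sits in CLOSE_WAIT.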


Such bugs in applications may be "masked" by a few things:

1) If the remote side called close() rather than shutdown(SHUT_WR) then an attempt on the CLOSE_WAIT side to send data to the remote will cause the remote TCP to return a RST segment (reset) because there is no longer anything above TCP to receive the data. This will then cause the local TCP to terminate the connection. This may also happen if the local application set SO_KEEPALIVE to enable TCP keepalives.

2) If the local side doesn't send anything, and doesn't have TCP keepalives set, if the remote TCP has a FIN_WAIT_2 timer of some sort going (long story involving a hole in the TCP specification and implementation workarounds, email if you want to hear it) then when that FIN_WAIT_2 timer expires the remote TCP may send a RST segment.

RST segments are "best effort" in sending - they don't get retransmitted explicitly. In case 1 if the RST segment doesn't make it back, the local TCP will retransmit the data it was sending (because it will not have received an ACKnowledgement either). It will then either receive the RST triggered by that retransmission, or if no RSTs ever make it back, the local TCP will at some point reach its retransmission limit and terminate the connection. In case 2, if that one RST is lost, that's it, and the CLOSE_WAIT may remain forever.
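For reference, enabling the keepalives mentioned in case 1 might look like the sketch below. The function name and the timer values are illustrative assumptions, and the TCP_KEEP* options are platform-specific, so they are only set where Python's socket module exposes them. Keepalives only mask the missing close(); they do not fix it.

```python
import socket


def enable_keepalive(sock, idle=60, interval=10, probes=5):
    """Turn on TCP keepalives; the timer values are illustrative only."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Per-socket tuning knobs are not available on every platform,
    # so apply each one only if this Python build defines it.
    tuning = (
        ("TCP_KEEPIDLE", idle),       # seconds idle before the first probe
        ("TCP_KEEPINTVL", interval),  # seconds between probes
        ("TCP_KEEPCNT", probes),      # unanswered probes before giving up
    )
    for name, value in tuning:
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
```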

Again though, given the rarity of actual application use of a simplex TCP connection, 99 times out of 100, seeing lots of CLOSE_WAIT connections building up implies a bug in the application or in the libraries doing work on its behalf.
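One way to watch for such a build-up on a Linux host, sketched here under the assumption that /proc/net/tcp is readable and uses the kernel's usual layout (state in the fourth column, hex 08 for CLOSE_WAIT). The function names are invented for this example, and IPv6 sockets live in /proc/net/tcp6, which is not covered.

```python
CLOSE_WAIT = "08"  # Linux kernel's hex state code for CLOSE_WAIT


def count_close_wait(proc_net_tcp_text):
    """Count CLOSE_WAIT entries in text formatted like /proc/net/tcp."""
    count = 0
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        # Columns: sl, local_address, rem_address, st, ...
        if len(fields) > 3 and fields[3] == CLOSE_WAIT:
            count += 1
    return count


def count_close_wait_on_this_host(path="/proc/net/tcp"):
    """Linux-only convenience wrapper around count_close_wait()."""
    with open(path) as f:
        return count_close_wait(f.read())
```

A count that keeps climbing while the application runs is the symptom described above.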


rick jones

