This was originally posted on squid-users, but Amos asked me to send it to squid-dev. This email has been slightly updated from the original.

I've identified a problem with Squid 3.5.26 using a lot of memory when some broken clients are on the network. Strictly speaking this isn't really Squid's fault, but it is a denial of service mechanism so I wonder if Squid can help mitigate it.

The situation is this:

Squid is set up as a transparent proxy performing SSL bumping.
A client makes an HTTPS connection, which Squid intercepts. The client sends a TLS client handshake and Squid responds with a handshake and the bumped certificate. The client doesn't like the bumped certificate, but rather than cleanly aborting the TLS session and then sending a TCP FIN, it just tears down the connection with a TCP RST packet.

Ordinarily, Squid's side of the connection would be torn down in response to the RST, so there would be no problem. But unfortunately, under high network loads the RST packet sometimes gets dropped and as far as Squid is concerned the connection never gets closed.

The busted clients I'm seeing the most problems with retry the connection immediately rather than waiting for a retry timer.


Problems:
1. A connection that hasn't completed the TLS handshake doesn't appear to ever time out (in this case, the server handshake and certificate exchange have been completed, but the key exchange never starts).

2. If the client sends an RST and the RST is lost, the client won't send another RST until Squid sends some data to it on the aborted connection. In this case, Squid is waiting for data from the client, which will never come, and will not send any new data to the client. Squid will never know that the client aborted the connection.

3. There is a lot of memory associated with each connection - my tests suggest around 1MB. In normal operation these kinds of dead connections can gradually stack up, leading to a slow but significant memory "leak"; when a really badly behaved client is on the network it can open tens of thousands of connections per minute and the memory consumption brings down the server.

4. We can expect similar problems with devices on flaky network connections, even when the clients are well behaved.
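
To illustrate problem 2: the far end only repeats its RST if something arrives on the dead connection, and TCP keepalive probes are one standard way to make that happen on an otherwise idle socket. The sketch below shows the socket options involved, in Python to match the attached script. It is an illustration under stated assumptions, not a description of what Squid does: the helper name `enable_keepalive` and the timing values are hypothetical, and the `TCP_KEEP*` options are platform specific (present on Linux), so they are guarded.

```python
import socket


def enable_keepalive(sock, idle=60, interval=10, probes=3):
    """Ask the kernel to probe an idle TCP connection (hypothetical helper).

    A probe that reaches a peer which has already discarded the connection
    should be answered with an RST, which lets this side notice the abort.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Tuning knobs are not available on every platform, so check first.
    if hasattr(socket, "TCP_KEEPIDLE"):
        # Seconds of idleness before the first probe is sent.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle)
    if hasattr(socket, "TCP_KEEPINTVL"):
        # Seconds between unanswered probes.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval)
    if hasattr(socket, "TCP_KEEPCNT"):
        # Unanswered probes before the kernel gives up on the connection.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, probes)
    return sock
```

With these example values a dead peer would be noticed after roughly idle + interval * probes seconds, which only bounds the problem and does not replace a proper handshake timeout.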


My thoughts:
Connections should have a reasonably short timeout during the TLS handshake - if a client hasn't completed the handshake and made an HTTP request over the encrypted connection within a few seconds, something is broken and Squid should tear down the connection. These connections certainly shouldn't be able to persist forever with neither side sending any data.
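
The idea of a handshake deadline can be sketched as a small bookkeeping structure: record when each connection was accepted, forget it once the handshake completes, and periodically collect anything that has overrun the deadline. This is only a sketch of the logic in Python; the class name `PendingHandshakes`, its methods and the 5 second value are hypothetical and say nothing about how Squid's internals are organised.

```python
import time


class PendingHandshakes:
    """Track connections that have not yet finished their TLS handshake."""

    def __init__(self, timeout_seconds, clock=time.monotonic):
        self.timeout_seconds = timeout_seconds
        self.clock = clock      # injectable so the logic can be tested
        self.started = {}       # connection id -> time it was accepted

    def accepted(self, conn_id):
        """Call when a new connection is accepted."""
        self.started[conn_id] = self.clock()

    def handshake_done(self, conn_id):
        """Call when the handshake completes; the deadline no longer applies."""
        self.started.pop(conn_id, None)

    def expired(self):
        """Return and forget the connections that overran the deadline."""
        now = self.clock()
        stale = [c for c, t in self.started.items()
                 if now - t >= self.timeout_seconds]
        for c in stale:
            del self.started[c]
        return stale


if __name__ == "__main__":
    pending = PendingHandshakes(timeout_seconds=5)
    pending.accepted("client-1")
    # A real server would call expired() from its event loop and close
    # every connection it returns.
    print(pending.expired())
```

The caller would close whatever `expired()` returns, so a connection whose RST was lost is held for at most the deadline rather than indefinitely.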


Testing:
I wrote a Python script (attached) that makes 1000 concurrent connections as quickly as it can and sends a TLS client handshake over each of them. Once all of the connections are open, it waits for responses from Squid (which would contain the server handshake and certificate) and quits, tearing down all of the connections with an RST.

It seems that the RST packets for around 150-300 of those connections were dropped - this sounds surprising, but since all 1000 connections were aborted simultaneously, there would be a flood of RST packets and it's probably reasonable to expect a significant number to be dropped. The end result was that netstat showed Squid still had about 150-300 established connections, which would never go away.
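
As a back-of-the-envelope check, the figures quoted in this email can be combined into a rough estimate of the memory pinned by one test run. The function name `stuck_memory_mb` is hypothetical, the 1MB per connection figure is the approximate one given above, and the 15-30% loss rate comes from a single simultaneous-abort test, so it may not carry over to other traffic patterns.

```python
def stuck_memory_mb(aborted, rst_loss_fraction, mb_per_connection=1.0):
    """Estimate the memory (MB) held by connections whose RST never arrived."""
    return aborted * rst_loss_fraction * mb_per_connection


# One test run: 1000 aborted connections with 15-30% of the RSTs lost.
low = stuck_memory_mb(1000, 0.15)
high = stuck_memory_mb(1000, 0.30)
print("roughly %.0f-%.0f MB held per run" % (low, high))
```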

Amos has said he believes the connections should eventually time out (via the request_timeout option) but I don't think this is the case. request_timeout apparently defaults to 5 minutes, but these connections are still present on the Squid end over 2.5 hours after they were aborted. Example from the netstat output:

    tcp 0 0 ::ffff:34.233.104.170:443 ::ffff:192.168.8.5:51070 ESTABLISHED 29317/(squid-1)
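
For anyone repeating the test, counting the leftover connections can be scripted rather than done by eye. The sketch below assumes netstat output with the same seven columns as the example line above (protocol, two queue sizes, local address, foreign address, state, PID/program); the function name `count_established` and its defaults are hypothetical.

```python
def count_established(netstat_output, local_port=443, program="squid"):
    """Count ESTABLISHED TCP connections on local_port owned by program."""
    suffix = ":%d" % local_port
    count = 0
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip header lines and anything without all seven columns.
        if len(fields) < 7 or not fields[0].startswith("tcp"):
            continue
        local, state, owner = fields[3], fields[5], fields[6]
        if state == "ESTABLISHED" and local.endswith(suffix) and program in owner:
            count += 1
    return count


if __name__ == "__main__":
    sample = ("tcp 0 0 ::ffff:34.233.104.170:443 "
              "::ffff:192.168.8.5:51070 ESTABLISHED 29317/(squid-1)")
    print(count_established(sample))
```

Running this against successive netstat captures would show whether the number of lingering connections ever drops after request_timeout has elapsed.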


--
  - Steve Hill
    Technical Director
    Opendium    Online Safety / Web Filtering    http://www.opendium.com

    Enquiries                 Support
    ---------                 -------
    sa...@opendium.com        supp...@opendium.com
    +44-1792-824568           +44-1792-825748
_______________________________________________
squid-users mailing list
squid-us...@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-users
#!/usr/bin/python

import socket
import struct

HOSTNAME="nexus-websocket-b.intercom.io"
PORT=443

HANDSHAKE=[0x16, 0x03, 0x01, 0x00, 0xc9, 0x01, 0x00, 0x00, 
0xc5, 0x03, 0x03, 0x55, 0xb0, 0x25, 0xb8, 0x90, 
0xbd, 0x9e, 0x2a, 0x78, 0xc9, 0xca, 0xe7, 0x97, 
0xbd, 0x37, 0x1a, 0x4c, 0xa2, 0x4a, 0x2b, 0x12, 
0xca, 0x51, 0x42, 0xe4, 0x1a, 0x8b, 0x5b, 0x74, 
0xd1, 0xc2, 0xa3, 0x00, 0x00, 0x1e, 0xc0, 0x2b, 
0xc0, 0x2f, 0xcc, 0xa9, 0xcc, 0xa8, 0xc0, 0x2c, 
0xc0, 0x30, 0xc0, 0x0a, 0xc0, 0x09, 0xc0, 0x13, 
0xc0, 0x14, 0x00, 0x33, 0x00, 0x39, 0x00, 0x2f, 
0x00, 0x35, 0x00, 0x0a, 0x01, 0x00, 0x00, 0x7e, 
0x00, 0x00, 0x00, 0x22, 0x00, 0x20, 0x00, 0x00, 
0x1d, 0x6e, 0x65, 0x78, 0x75, 0x73, 0x2d, 0x77, 
0x65, 0x62, 0x73, 0x6f, 0x63, 0x6b, 0x65, 0x74, 
0x2d, 0x61, 0x2e, 0x69, 0x6e, 0x74, 0x65, 0x72, 
0x63, 0x6f, 0x6d, 0x2e, 0x69, 0x6f, 0x00, 0x17, 
0x00, 0x00, 0xff, 0x01, 0x00, 0x01, 0x00, 0x00, 
0x0a, 0x00, 0x0a, 0x00, 0x08, 0x00, 0x1d, 0x00, 
0x17, 0x00, 0x18, 0x00, 0x19, 0x00, 0x0b, 0x00, 
0x02, 0x01, 0x00, 0x00, 0x23, 0x00, 0x00, 0x00, 
0x10, 0x00, 0x0e, 0x00, 0x0c, 0x02, 0x68, 0x32, 
0x08, 0x68, 0x74, 0x74, 0x70, 0x2f, 0x31, 0x2e, 
0x31, 0x00, 0x05, 0x00, 0x05, 0x01, 0x00, 0x00, 
0x00, 0x00, 0x00, 0x0d, 0x00, 0x18, 0x00, 0x16, 
0x04, 0x03, 0x05, 0x03, 0x06, 0x03, 0x08, 0x04, 
0x08, 0x05, 0x08, 0x06, 0x04, 0x01, 0x05, 0x01, 
0x06, 0x01, 0x02, 0x03, 0x02, 0x01]

# Construct the binary handshake message.  HANDSHAKE contains a copy of an
# SSL client handshake which was captured with Wireshark.
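# bytes(bytearray(...)) yields a byte string under both Python 2 and
# Python 3, so the result can be passed straight to sock.send().
HANDSHAKE_STR = bytes(bytearray(HANDSHAKE))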


count=0
max_connections = 1000
socks = []

# Open max_connections connections to the remote server and send the handshake.
while len(socks) < max_connections:
	count += 1
	if count % 100 == 0:
		print(count)

	HOST = socket.getaddrinfo(HOSTNAME, PORT)[0][4][0]
	sock=socket.socket()

	# Enable SO_LINGER with a zero timeout so the socket sends an RST
	# when closed instead of cleanly shutting down with a FIN.
	l_onoff = 1
	l_linger = 0
	sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack('ii', l_onoff, l_linger))

	sock.connect((HOST, PORT))
	sock.send(HANDSHAKE_STR)
	socks.append(sock)
	del sock

while len(socks) > 0:
	# Wait until we've received something from all sockets - this should
	# be the server handshake.
	socks.pop(0).recv(1)

# Script exits here and the sockets will all be torn down, emitting RSTs.

_______________________________________________
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev