On 1/12/2012 1:31 p.m., Henrik Nordström wrote:
On Fri 2012-11-30 at 15:30 -0700, Alex Rousskov wrote:

     Squid is sending POST requests on reused pinned connections, and
some of those requests fail due to a pconn race, with no possibility for
a retry.
Yes... and we have to for NTLM, TPROXY and friends or they get in a bit
of trouble from connection state mismatch.

If sending the request fails we should propagate this to the client by
resetting the client connection and letting the client retry.

It seems to me we are also forced to do this for ssl-bump connections.
* Opening a new connection is the wrong thing to do for server-first bumped connections, where the new connection MAY go to a completely different server than the one whose certificate we bumped with. We control the IP:port we connect to, but we cannot control the existence of IP-level load balancers.
* client-first bumped connections do not face the lag, *BUT* there is no way to identify them at forwarding time separately from server-first bumped ones.
* we are pretending to be a dumb relay - which offers the ironclad guarantee that the server at the other end is a single TCP endpoint (DNS uncertainty is only on the initial setup. Once connected, if packets reach *an* endpoint they all do, or the connection dies).


We can control the outgoing IP:port details, but have no control over the existence of IP-level load balancers which can screw with the destination server underneath us. Gambling on the destination not changing on an HTTPS outbound when retrying for intercepted traffic will re-open at least two CVE issues that 3.2 is supposed to be immune to (CVE-2009-0801 and CVE-2009-3555).

Races are also still very possible on server-bumped connections if for any reason it takes longer to receive+parse+adapt+reparse the client request than the server wants to wait for. Remember we have all the slow trickle arrival of headers, parsing, adaptation, helpers and access controls to work through before the request gets to use the pinned server conn. For example, Squid is extremely likely to lose closure races on a mobile network when some big event is on that everyone has to google/twitter/facebook about while every request gets bumped and sent through an ICAP filter (BBC at the London Olympics).


When using SslBump, the HTTP request is always forwarded using a server
connection "pinned" to the HTTP client connection. Squid does not reuse
a persistent connection from the idle pconn pool for bumped client
requests.
Ok.

  Squid uses the dedicated pinned server connection instead.
This bypasses pconn race controls even though Squid may be essentially
reusing an idle HTTP connection and, hence, may experience the same kind
of race conditions.
Yes..

However, connections that were just pinned, without sending any
requests, are not "essentially reused idle pconns" so we must be careful
to allow unretriable requests on freshly pinned connections.
?

A straight usage counter is definitely the wrong thing to use to control this, whether or not you agree with us that re-trying outbound connections is safe after guaranteeing the client (with an encryption certificate no less) that a single destination has been set up. What is needed is a suitable length idle timeout and a close handler, both of which for bumped connections should trigger un-pinning and abort the client connection. If the timeouts are not being set on server-bump pinned connections then that is the bug and needs to be fixed ASAP.
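
To make the timeout-plus-close-handler idea concrete, here is a rough sketch in Python. It is not Squid code: `ClientConn`, `PinnedServerConn`, `check_idle` and the timeout value are all hypothetical names made up for illustration, and time is passed in explicitly instead of read from a real event loop.

```python
from dataclasses import dataclass


@dataclass
class ClientConn:
    """Hypothetical stand-in for the client side of a bumped connection."""
    aborted: bool = False

    def abort(self) -> None:
        self.aborted = True


@dataclass
class PinnedServerConn:
    """Hypothetical model of a server connection pinned to one client."""
    client: ClientConn
    idle_timeout: float          # allowed idle seconds (assumed to be tunable)
    last_activity: float = 0.0   # time of the last packet seen on the socket
    pinned: bool = True

    def touch(self, now: float) -> None:
        """Record traffic on the socket, restarting the idle timer."""
        self.last_activity = now

    def on_close(self) -> None:
        """Close handler: the server went away, so unpin and abort the client."""
        self._unpin_and_abort()

    def check_idle(self, now: float) -> None:
        """Idle timeout: treat an over-long quiet period like a close."""
        if self.pinned and now - self.last_activity >= self.idle_timeout:
            self._unpin_and_abort()

    def _unpin_and_abort(self) -> None:
        # Only act once; a second close/timeout event is a no-op.
        if self.pinned:
            self.pinned = False
            self.client.abort()
```

Note there is no usage counter anywhere in the sketch: the only inputs are how long the socket has been quiet and whether the server closed it.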


The issue is not whether the conn was used then pooled versus pinned. The issue is the async period between the last and current packet on the socket: we have no way to identify whether the time in between has caused problems (crtd, adaptation or ACL lag might be enough to lose some race with NAT timeouts). That holds whether the past use was the SSL exchange (server-bump only) or a previous HTTP data packet. I agree this is just as true of bumped connections which were pinned at some unknown time earlier as it is of connections pulled out of a shared pool and last used some unknown time earlier. Regardless of how the persistence was done, they *are* essentially reused idle persistent connections. All the same risks/problems apply, but whether a retry or an alternative connection setup is possible differs greatly between the traffic types: with intercepted traffic (of any source) the re-try is more dangerous than informing the client with an aborted connection.
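
As a rough illustration of that last point, the decision on a failed send could be sketched like this in Python. Again this is not Squid code: the `Action` enum, `on_send_failure` and its flags are hypothetical, and treating an unretriable request on a pooled connection as "abort the client" is my assumption, following Henrik's suggestion earlier in the thread.

```python
from enum import Enum, auto


class Action(Enum):
    RETRY_NEW_CONNECTION = auto()  # open a fresh server connection and resend
    ABORT_CLIENT = auto()          # reset the client connection; client retries


def on_send_failure(*, pinned: bool, intercepted: bool, bumped: bool,
                    request_retriable: bool) -> Action:
    """Pick a reaction when a send fails on a reused idle connection.

    All flags are hypothetical inputs for this sketch.
    """
    # Pinned, intercepted or bumped traffic promised the client one specific
    # destination, so silently reconnecting elsewhere is the dangerous option.
    if pinned or intercepted or bumped:
        return Action.ABORT_CLIENT
    # Plain pooled pconn: retrying is fine if the request can be replayed.
    if request_retriable:
        return Action.RETRY_NEW_CONNECTION
    # Assumption: an unretriable request (e.g. a POST whose body is already
    # consumed) is also surfaced to the client rather than retried.
    return Action.ABORT_CLIENT
```

So a GET on a connection from the shared pool gets retried, while the same failure on a bumped or intercepted connection is reported to the client.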


The same logic applies to pinned connection outside SslBump.
Which is quite likely the wrong thing to do. See above.

Regards
Henrik


Amos
