Re: [squid-dev] [RFC] simplifying ssl_bump complexity

Marcus Kool Sun, 20 Nov 2016 04:56:39 -0800


On 11/20/2016 12:06 AM, Amos Jeffries wrote:

On 20/11/2016 12:08 p.m., Marcus Kool wrote:

[snip]


I like the intent of the proposal and the new directives tls_*.
What currently makes configuration in Squid 3/4 difficult is
the logic of 'define in step x what to do in the next step' and
IMO this logic is the main cause of misunderstandings and
incorrect configurations.  Also the terms 'bump' and 'splice'
do not help ease of understanding.  Since Squid evolved and
bumping changed from 3.3 - 3.4 - 3.5 to 4.x, and likely will
change again in 5.x, there is an opportunity to improve
things more than is proposed.
There is also a difference in dealing with transparent intercepted
connections and direct connections (browsers doing a CONNECT)
which also causes some misunderstandings.
The current ssl bump steps allow problematic configs where Squid
bumps or stares in one step and to splice in an other step,
which can be resolved (made impossible) in a new configuration syntax.

I propose to use a new logic for the configuration directives
where 'define in step x what to do in the next step' is replaced
with a new logic 'define in step x what to do _now_'.


From reading the below I think you are mistaking what "now" means to
Squid. Input access control directives in squid.conf make a decision
about what action to do based on some state that just arrived.


Maybe it is necessary to redefine 'now' but my point remains that
'define in step x what to do in the next step' is the cause of
most misunderstandings.

For example:
 HTTP message just finished parsing -> check http_access what to do with it.
 HTTP reply message just arrived -> check http_reply_access what to do
with it.

Thus my proposal was along the lines of:
  client hello recieved -> check tls_client_hello what to do with it.
  server hello recieved -> check tls_server_hello what to do with it.


For both hello messages: is the decision moment the moment where it
has been peeked at?


Below is a new proposal to attempt to make the configuration
more intuitive and less prone for admin misunderstandings.

First the admin must define if there is any bumping at all.
This could be done with
https_decryption on|off
This is similar to tls_new_connection peek|splice but much
more intuitive.

Iff https_decryption is on:

1) the "connection" step:
When a browser uses "CONNECT <FQDN>" Squid does not need to make
peek or splice decisions.
When Squid intercepts a connection to "port 443 of <IP>" no peek
or splice decision is made here any more.
This step becomes obsolete in the proposed configuration.


I am still hoping that will happen. But also still getting pushback that
people want to terminate or splice without even looking at the
clear-text hello details.


We must know the reasons behind this pushback.  Only then sane decisions
can be made.


2) the "TLS client hello" step:
When a browser uses CONNECT, Squid has a FQDN and does not need
peeking a TLS client hello message. It can use the tls_client_hello
directives given below.


Sadly this is not correct. Squid still needs to get the client hello
details at this point. They are needed to perform bump before the server
hello is received, and to "terminate with an error message" without
contacting a server.


yes, correct.  Squid must do this.  But does it have to be configured?

When Squid intercepts a connection, Squid always peeks to retrieve
the SNI which is the equivalent of the FQDN used by a CONNECT.
In this step admins may want to define what Squid must do, e.g.
tls_client_hello passthrough aclfoo
Note that the acl 'aclfoo' can use tls::client_servername and
tls::client_servername should always have a FQDN if the connection
is https.  tls::client_servername expands to the IP address if
the SNI of an intercepted connection could not be retrieved.


What if the SNI contradicts the CONNECT message FQDN ?
What if a raw-IP in the CONNECT message (or TCP SYN) does not belong to
the server named in SNI ?


:-)  I left this out on purpose to not make the post even larger than it was.
There is of course a lot of error checking.  The question is if
we have to configure it.  If yes, can we get away with one directive based
on an acl that uses tls::handshake_failure ?

Squid would now be diverting the client transparently to a server other
than the one it expects and caching under that FQDN. But the server cert
would still authenticate as being the SNI host, so TLS cannot detect the
diversion.

The fake CONNECT's are a bit messy but IMHO we can only get rid of the
first one done for intercepted connections. Although that alone would
make both cases handle the same way.


I do not know anything about the code that generates the fake CONNECT
of an transparent interception connection, but logically there should
not be a fake CONNECT for true HTTPS (TLS+HTTP) since a browser does
not do a CONNECT, so why fake one?  Was the fake CONNECT introduced
to keep a state?  If yes, the state can be administered in a different way.

However, in case of other protocols on port 443 a fake CONNECT is a
must to be able to filter and log it.

For https connections with a client hello without the SNI extension:
tls_client_hello passthrough|terminate aclbar
where aclbar can contain tls::client_hello_missing_sni

For connections that do not use TLS (i.e. not a valid
TLS client hello message was seen):
tls_client_hello passthrough|terminate aclbar2
where aclbar2 may contain tls::handshake_failure

To define that the TLS handshake continues, the config can contain
tls_client_hello continue
This is a basically a no-op and not required but enhances readability
of a configuration.


This is getting complicated again. Avoiding the first (TCP SYN) access
control cycles will make ssl::server_name ACL contain either FQDN/SNI or
the actual server cert name. A default "none" or "-" value when no
FQDN/SNI is available should be sufficient for matching these clients.
And we should have that already I think, if not its a missing feature bug.


Maybe in some cases tls_new_connection is required to be used, but
lets start with a default of
tls_new_connection continue
Under normal circumstances there is no need to define that Squid must
peek or already define that the connection must be spliced.

In my proposal there is no ssl::server_name and I did that on purpose.
While it might seem a nice idea to evaluate it from 'none' to SNI
to server name, there is no need for it and the use of tls::client_servername
and tls::server_servername explicitly refers to which servername
Squid must look at.
And at the moment that tls_new_connection is evaluated, there
is no servername when intercepted.

Also even "no-op" ACLs still require CPU time to assemble data for ACL
checking. So avoiding that kind of thing in configs is good.


Maybe not, the no-op was there for readability and at config parse time
Squid can simply skip it to not have any CPU cycles wasted on it
by acl matching logic.

I am not familiar with all steps/stages in the implementation but
one may get better performance in a simpler design if that reduces
the number of events that Squid processes for each TLS connection.

3) the "TLS server hello" step:
Usually no directives are needed since rarely actions are taken
based on the server hello message, so the default is
tls_server_hello continue


Once Squid gets to this stage of processing "continue" could mean either
splice or bump, but only one of the two is possible.

Bump is not legal in a lot of situations but splice may not be possible.
So terminate is the only realistic default we can set for general use.
As you say, its uncommon to need it, and people who do tend to know what
they want done. So will configure that.


Admin who want https decryption do not want that the connection is
terminated by default.

You are right that tls_server_hello needs a little more thought.
Somewhere must be defined if the connection will have mode passthrough
(splice) or mode decrypt (bump).
And the right place to make that decision is at tls_client_hello
(after the SNI has been peeked at).
Then for tls_server_hello remains 'continue' and 'terminate' decisions.

The handshake containing error signals or other protocols are different
decisions points. Remember Squid is deciding what to do with some data
that just arrived. If its going to tunnel/splice the server hello, there
is no point even evaluating what the protocol inside is, and error
alerts get delivered to the client as-is.

So the existing on_unsupported_protocol and sslproxy_cert_error to make
those handling decisions with 'ssl_error' type ACL to match the specific
problem are fine IMO.


yes, they look fine but may need a rename to have a tls_ prefix
but I cannot find a way to define
continue when cert-is-self-signed and server_name is '.nic.es' and
otherwise terminate when cert-is-self-signed.

The tls_server_hello can be used to terminate specific connections.
In this step many types of certificate errors can be detected
and in the Squid configuration there must be a way to define
what to do for specific errors and optionally for which FQDN.
E.g. allow to define that connections with self-signed certificates
are terminates but the self-signed cert for domain foo.example.com
is allowed.  See also the example config below and the use of
tls::server_servername.

What is left, is a configuration directive for connections
that use TLS as an encryption wrapper but do not use HTTP
inside the TLS wrapper:
tls_no_http passthrough|terminate   # similar to on_unsupported_protocol

An example configuration looks like this:
https_decryption on
acl banks tls::client_servername .bank1.example.org
acl no_sni tls::client_hello_missing_sni
acl no_handshake tls::handshake_failure
acl hacked_server tls::server_servername evil.example.com
tls_client_hello passthrough banks
tls_client_hello terminate no_sni
tls_client_hello passthrough no_handshake
tls_client_hello continue
tls_server_hello terminate hacked_server
tls_server_hello continue
tls_no_http passthrough

The configuration directives as I proposed are IMO intuitive and
leave very little room for misunderstandings.


Looks just like what I'm proposing except your config above lets clients
connect to servers with invalid TLS credentials (maybe not so intuitive?).


Are you referring to 'tls_client_hello passthrough no_handshake' ?
That should be 'tls_client_hello terminate no_handshake'.

 acl banks ssl::server_name .bank1.example.org
 acl no_sni ssl::server_name "-"
 acl hacked_server ssl::server_name evil.example.com

 tls_client_hello splice banks
 tls_client_hello terminate no_sni
 tls_client_hello terminate hacked_server
 tls_client_hello peek all

 tls_server_hello terminate hacked_server
 tls_server_hello splice all

 # NP: "sslproxy_cert_error deny all" is default

 on_unsupported_protocol tunnel all


In my example at the time that tls_client_hello is evaluated,
the client hello has not only arrived but has also been parsed
and the SNI is known.  You also seem to intend that given
the rule 'tls_client_hello splice banks', but then why do you
have the rule 'tls_client_hello peek all' ?  Does that mean
'peek at the received client hello and continue' ?

The 'tls_client_hello continue' was intended to mean 'decrypt
the connection' (bump) but I failed to mention that.  I think
that 'tls_client_hello continue' must be replaced with
'tls_client_hello decrypt'.

There is a distinction between tls::server_servername and
tls::client_servername on one hand and ssl:server_name on the
other hand.  I prefer tls::* since it explicitly refers to
one of the servernames while ssl::server_name refers to
a dynamic object that needs evaluation and may evaluate to
different server names at different times.

One of the rules with 'terminate hacked_server' seems
redundant.

So after corrections, my example becomes
https_decryption on
acl banks tls::client_servername .bank1.example.org
acl no_sni tls::client_hello_missing_sni
acl no_handshake tls::handshake_failure
acl hacked_server tls::server_servername evil.example.com
tls_client_hello passthrough banks
tls_client_hello terminate no_sni
# tls_client_hello terminate no_handshake  # default behavior, so no rule needed
tls_client_hello decrypt
tls_server_hello terminate hacked_server
tls_server_hello continue
tls_unsupported_protocol passthrough all

In your example the meaning is to splice all connections and
with my proposed syntax the config would look like this:

https_decryption on
acl banks tls::client_servername .bank1.example.org
acl no_sni tls::client_hello_missing_sni
acl no_handshake tls::handshake_failure
acl hacked_server tls::server_servername evil.example.com
tls_client_hello passthrough banks
tls_client_hello terminate no_sni
tls_client_hello passthrough
tls_server_hello terminate hacked_server
tls_server_hello continue
tls_unsupported_protocol passthrough all

The choice of keywords for the config may seem a matter of taste,
but I feel it is more than that and that the keyword
'decrypt' is far better than 'bump' and the keyword
'passthrough' is better than 'splice'.

Marcus

Amos

_______________________________________________
squid-dev mailing list
[email protected]
http://lists.squid-cache.org/listinfo/squid-dev

_______________________________________________
squid-dev mailing list
[email protected]
http://lists.squid-cache.org/listinfo/squid-dev

Re: [squid-dev] [RFC] simplifying ssl_bump complexity

Reply via email to