On 07/08/17 08:11, lucas.alv...@laposte.net wrote:

 >> adaptation_meta X-SNI "%ssl::>sni" all   #or connect
 >> #request_header_add X-SNI "%ssl::>sni" all
 >> "
 >> So i want to create an icap service like squidclamav but it must check
 >> SNI not URLs.
 >Any particular reason why?
 > SNI has almost nothing to do with the HTTP messages (plural). It is
 > simply the name of the next-hop server (or proxy) they should be
 > delivered to on their way around the web.
 >I thought squidclamav was an antivirus, not a URL blocklist checker.
You're right: squidclamav is an antivirus, but it offers other services too; it can also check URLs and match them against a blacklist or whitelist. I don't want to decrypt HTTPS traffic, but I want to know where the client is trying to connect. I thought SNI was the only way to know the server name and the domain without decrypting anything.

Sort of yes, and sort of no. SNI is the name of the server the client wants to connect to, but it does not necessarily have any relation to the HTTPS message URL-domain. It could be any of the many names each server has pointing to it, eg. a private/internal hostname or a virtual-host domain. The HTTPS message may go to that same name, or to any of the server's other names. With HTTP/2 becoming more popular the Alt-Svc / ALTSVC feature is getting more traction. In that case the SNI can be expected to contain the alternative server's name, while the HTTPS message URL has the exact domain/URL wanted from that server.

A slightly more accurate value is the SubjectAltName field of the server cert sent with the ServerHello, which lists the names the server is publicly advertising itself to be. That is also available without decrypting, using a peek/stare at step2 of SSL-Bump. That field is more accurate because it can and often does contain a whole list of the server's various names, including wildcard sub-domains, which reflects the reality of what a "site" actually looks like. Despite many of us humans thinking a site/domain is a singular thing, it is actually a messy collection of pieces.

These details are all part of why the ssl::server_name ACL type exists separately from the more familiar dstdomain ACL type.

IMHO if you want your service to cope well with virtual hosting etc, sending it both the SNI and the full SubjectAltName set of values would be best. Then it can decide whether any of those details needs a block or is safe to allow.
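
As a rough illustration of that decision, here is a minimal Python sketch of what the service side could do once it has both values. The function names (normalise, in_domain, should_block) and the idea that the values arrive as plain strings are assumptions made for this example; they are not part of Squid, c-icap or squidclamav.

```python
from typing import Iterable, Optional


def normalise(name: str) -> str:
    """Lower-case a DNS name, drop a trailing dot and a leading wildcard label."""
    name = name.strip().lower().rstrip(".")
    if name.startswith("*."):
        name = name[2:]
    return name


def in_domain(name: str, domain: str) -> bool:
    """True if name is the domain itself or one of its sub-domains."""
    name, domain = normalise(name), normalise(domain)
    return name == domain or name.endswith("." + domain)


def should_block(sni: Optional[str],
                 subject_alt_names: Iterable[str],
                 blocklist: Iterable[str]) -> bool:
    """Block when the SNI or any SubjectAltName falls under a listed domain.

    A missing SNI may be passed as None or "-" (assumed placeholder).
    """
    blocked = list(blocklist)
    candidates = [n for n in [sni, *subject_alt_names] if n and n != "-"]
    return any(in_domain(n, d) for n in candidates for d in blocked)
```

Blocking on any SubjectAltName match is the strict choice. Certificates shared between unrelated sites (CDNs, for example) could then cause over-blocking, so a real service may prefer to weigh the SNI more heavily than the certificate names.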

The final goal is to blacklist, for example, google: when the SNI indicates www.google.com, c-icap denies the access.

 >> I peek all the steps to get sni and in the squid access log, sni is
 >> printed .
 >> I read that adaptation_meta can send anything from squid to icap but
 >> clearly i use it incorrectly: i can't see sni on icap access log or in
 >> icap headers.
 > Your usage appears to be correct. I think there is no SNI being received
 > by Squid.

That's problematic because in my squid access log there are "www.youtube.com" and "www.google.com", which is exactly what I'm trying to pass to c-icap. It seems like squid does receive the SNI.

FWIW; Squid gets the values from:
 a) CONNECT tunnel request-target, or
 b) SNI, or
 c) server cert SubjectAltName, or
 d) decrypted HTTPS message URL, or
 e) reverse-DNS of the TCP dst-IP address.
In that order AFAIK. So if any of the non-SNI details becomes available Squid can log a name.
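
To make the consequence concrete, the fallback order listed above can be sketched in Python as follows. This only illustrates the order as described in this mail (which is itself "AFAIK"); it is not Squid's actual code, and the dictionary keys are made-up labels for the five sources.

```python
from typing import Mapping, Optional, Tuple

# Hypothetical labels for sources a) to e), in the order given above.
SOURCE_ORDER = (
    "connect_target",
    "sni",
    "cert_subject_alt_name",
    "decrypted_url_host",
    "reverse_dns",
)


def first_available_name(
    details: Mapping[str, Optional[str]]
) -> Optional[Tuple[str, str]]:
    """Return (source, name) for the first source that has a usable value."""
    for source in SOURCE_ORDER:
        value = details.get(source)
        if value and value != "-":   # "-" treated as "not available"
            return source, value
    return None
```

With an empty SNI but a known certificate name, first_available_name({"sni": None, "cert_subject_alt_name": "www.youtube.com"}) still returns a name, which is why a name in the log does not prove that SNI was received.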

That said, I do think you may be hitting a bug in Squid SNI handling; it's not perfect yet, particularly in the Squid-3.x code. So a traffic analysis with Wireshark or similar would be useful at this point to check and confirm whether SNI is given or one of those others is happening.
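
If the captured bytes are easier to script than to inspect by hand, the same check can be sketched in Python. The function below (extract_sni is a name chosen for this example) walks a raw TLS ClientHello record and returns the server_name extension value if there is one. It assumes the whole ClientHello sits in a single TLS record and only looks at the first name entry, so treat it as a sketch rather than a complete parser.

```python
import struct
from typing import Optional


def extract_sni(record: bytes) -> Optional[str]:
    """Return the SNI host name from a raw TLS ClientHello record, or None."""
    try:
        if record[0] != 0x16:            # record type: not a handshake
            return None
        pos = 5                          # skip 5-byte record header
        if record[pos] != 0x01:          # handshake type: not ClientHello
            return None
        pos += 4                         # handshake type + 3-byte length
        pos += 2 + 32                    # client_version + random
        pos += 1 + record[pos]           # session_id
        (suites_len,) = struct.unpack_from("!H", record, pos)
        pos += 2 + suites_len            # cipher_suites
        pos += 1 + record[pos]           # compression_methods
        if pos >= len(record):           # ClientHello without extensions
            return None
        (ext_total,) = struct.unpack_from("!H", record, pos)
        pos += 2
        end = min(pos + ext_total, len(record))
        while pos + 4 <= end:
            ext_type, ext_len = struct.unpack_from("!HH", record, pos)
            pos += 4
            if ext_type == 0:            # server_name extension
                # layout: list length (2), name type (1), name length (2), name
                name_type = record[pos + 2]
                (name_len,) = struct.unpack_from("!H", record, pos + 3)
                name = record[pos + 5:pos + 5 + name_len]
                if name_type != 0 or len(name) != name_len:
                    return None
                return name.decode("ascii")
            pos += ext_len               # skip any other extension
    except (IndexError, struct.error, UnicodeDecodeError):
        return None                      # truncated or malformed input
    return None
```

Feeding it the first client-to-server TCP payload of a connection shows directly whether the client sent SNI: a None result for a well-formed ClientHello means the name in the log came from one of the other sources.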

squid-users mailing list
