Re: [RFC] bandwidth savings via header eliding

2014-07-26 Thread Henrik Nordström
Fri 2014-07-25 at 19:05 +0300, Eliezer Croitoru wrote:

> The response to Alex's question of why anybody would want to drop the
> "cteonnt-length:" header:
> Some places do not allow cookies or POST for external services. It
> can sometimes look weird, but I still understand why it would be
> considered a security hole by some.

Dropping a mangled Content-Length header is not about security. It is no
more than a garbage header carrying no meaning other than a distant
echo of its original form. It is transformed in this manner to avoid
being read as Content-Length while doing a minimal lightweight
rewrite of the TCP/IP payload.
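
A minimal sketch in Python of the kind of same-length, in-place rewrite
described above. The function name and the scrambled spelling are
assumptions made for illustration, not any vendor's actual code. The
only point is that the payload length stays the same, so nothing has to
be re-buffered or re-framed.

```python
def mangle_header_name(payload: bytes, name: bytes, scrambled: bytes) -> bytes:
    """Overwrite a header name with a same-length scrambled spelling.

    Hypothetical helper: only the header section (before the first blank
    line) is searched, and the message length never changes.
    """
    if len(name) != len(scrambled):
        raise ValueError("replacement must keep the payload length unchanged")
    head_end = payload.find(b"\r\n\r\n")
    if head_end == -1:
        head_end = len(payload)
    # Header names are case-insensitive, so search a lowercased copy.
    pos = payload.lower().find(b"\r\n" + name.lower() + b":", 0, head_end)
    if pos == -1:
        return payload
    start = pos + 2  # skip the CRLF that precedes the header line
    return payload[:start] + scrambled + payload[start + len(name):]


response = b"HTTP/1.1 200 OK\r\nContent-Length: 5\r\n\r\nhello"
mangled = mangle_header_name(response, b"Content-Length", b"Cteonnt-Length")
```

After the rewrite a receiver no longer sees a Content-Length header,
only an unknown extension header of the same size.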

Bandwidth saving from dropping this header will be close to
unmeasurable.

Security impact likewise.

But sure, if you have a whitelist policy of only allowing what is
explicitly allowed then it would be dropped by the catch-all DROP
policy. But no hardwiring in our code is needed for that.

The discussion about the mangled Connection header may be more interesting,
but only if there are bugs in the software that mangled the Connection
header, leaving what were intended as hop-by-hop headers unmangled. But I
don't see much point in worrying about that until there is an
indication that such headers cause problems.

From the referenced discussion it's quite clear this rewrite practice is
limited to one cache appliance vendor. It is not likely to be
implemented by others.

Regards
Henrik



Re: [RFC] bandwidth savings via header eliding

2014-07-25 Thread Eliezer Croitoru
I do believe that from a run-time point of view (such as speed and cpu 
load) there is not much of a difference between using squid acls for 
headers and hard-coding a decision.


However, I do think that Squid users vary widely (ISPs, enterprises,
homes, SMBs, and others), and hard-coding the header ACLs would mean
that any default setting puts some of them in a very strange situation.


The response to Alex's question of why anybody would want to drop the
"cteonnt-length:" header:
Some places do not allow cookies or POST for external services. It
can sometimes look weird, but I still understand why it would be
considered a security hole by some.


For most default setups the main concern is caching, not ACLs and
security policy against data smuggling through the proxy.


We can try to release a recommended list of header ACLs for more secure
environments, and a less strict list for commonly used headers.


Eliezer

On 07/19/2014 09:06 PM, Alex Rousskov wrote:



>> or reject requests with cteonnt-length: header in self defense.
>
> Rejecting requests goes even further from the original RFC, but I
> believe we have configuration options to do that as well.
>
> Going forward, I suggest the following discussion format:
>
> For each header (or an algorithmically defined group of headers):
>
> a) Why might anybody want to drop this header?
>
> b) How do we want to tell Squid that the header should be dropped?
>    Hard-coded decision, existing squid.conf directives, other?
>
> c) What should be the default behavior? Drop or leave "as is"?
>
> d) Any side actions if the header is dropped? For example, should we
> also drop the headers the dropped header lists?
>
>
> HTH,
>
> Alex.





Re: [RFC] bandwidth savings via header eliding

2014-07-21 Thread Henrik Nordström
Sat 2014-07-19 at 11:35 -0600, Alex Rousskov wrote:

> The above email is talking about a "nnCoection: close" header which
> appears to be a result of a bug in some 15-year old software.
> Identifying that rare header would be overall harmful -- Squid would
> spend more resources on detecting that header presence than it will save
> by removing that header when it is found.

Indeed a valid question.

HTTP divides headers into two groups: end-to-end and hop-by-hop. Assuming
we intend to operate as a semantically transparent HTTP proxy (meaning
one that does not change the meaning of requests/responses), we MUST
NOT touch end-to-end headers.

The list of hop-by-hop headers is very short. Anything not in that list
is by definition end-to-end unless mentioned in Connection.
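
The rule above can be sketched in a few lines of Python. This is only an
illustration and not Squid code: the function names are made up, and the
fixed list is the one given in RFC 2616 section 13.5.1.

```python
# Hop-by-hop header names as listed in RFC 2616 section 13.5.1 (lowercased).
HOP_BY_HOP = frozenset({
    "connection", "keep-alive", "proxy-authenticate", "proxy-authorization",
    "te", "trailers", "transfer-encoding", "upgrade",
})


def hop_by_hop_names(headers):
    """Return lowercased names that are hop-by-hop for this message."""
    listed = set()
    for name, value in headers:
        if name.lower() == "connection":
            # Every token in Connection names a header (or option) that is
            # hop-by-hop for this message only.
            listed.update(t.strip().lower() for t in value.split(",") if t.strip())
    return {name.lower() for name, _ in headers
            if name.lower() in HOP_BY_HOP or name.lower() in listed}


def end_to_end(headers):
    """Keep only the headers a transparent proxy must pass on untouched."""
    drop = hop_by_hop_names(headers)
    return [(name, value) for name, value in headers if name.lower() not in drop]
```

Note that a header such as X-Powered-By falls in the end-to-end group
here simply because nothing lists it, which is exactly why a
semantically transparent proxy has no business removing it.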

> There are some gray areas like defense against request smuggling, but
> even there extreme care should be taken to avoid harming valid HTTP
> traffic. It is certainly not the area of "delete all bad headers in the
> parser" solutions.

Request/response smuggling is different. It uses
incorrectly or ambiguously formatted HTTP to try to make different agents
parse the data stream differently.

As for bandwidth savings, dropping "considered useless" headers will
save very little bandwidth and risks causing a lot of grief.

Much higher bandwidth savings can be achieved by header compression
using a dictionary, but that requires supporting agents on both sides of
the link.
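
To make the dictionary idea concrete, here is a rough Python sketch that
uses zlib's preset dictionary support as a stand-in. The dictionary
contents and function names are assumptions for illustration only. It
shows why both sides must cooperate: the receiver cannot decode the
blob without the same dictionary.

```python
import zlib

# Hypothetical shared dictionary: header text both agents expect to see often.
SHARED_DICT = (
    b"Content-Type: text/html\r\nContent-Length: \r\n"
    b"Cache-Control: max-age=\r\nConnection: keep-alive\r\n"
    b"X-Powered-By: \r\nServer: \r\n"
)


def compress_headers(raw: bytes, zdict: bytes = b"") -> bytes:
    """Deflate a header block, optionally primed with a preset dictionary."""
    comp = zlib.compressobj(zdict=zdict) if zdict else zlib.compressobj()
    return comp.compress(raw) + comp.flush()


def decompress_headers(blob: bytes, zdict: bytes = b"") -> bytes:
    """Inflate a header block; the same dictionary must be supplied."""
    decomp = zlib.decompressobj(zdict=zdict) if zdict else zlib.decompressobj()
    return decomp.decompress(blob) + decomp.flush()


headers = (b"Content-Type: text/html\r\nContent-Length: 1234\r\n"
           b"Connection: keep-alive\r\n")
with_dict = compress_headers(headers, SHARED_DICT)
without_dict = compress_headers(headers)
```

For a short header block like this one, the dictionary-primed output
should come out noticeably smaller than plain deflate, because most of
the text is already present in the dictionary.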

Regards
Henrik



Re: [RFC] bandwidth savings via header eliding

2014-07-19 Thread Alex Rousskov
On 07/18/2014 11:37 PM, Amos Jeffries wrote:
> On 19/07/2014 2:55 a.m., Alex Rousskov wrote:
>> On 07/18/2014 01:32 AM, Amos Jeffries wrote:
>>> I have wondered about creating a registry of known garbage and simply
>>> dropping those headers on arrival in the parser. This would be in
>>> addition to the header registry lookup and masking process we have for
>>> hop-by-hop headers.

>> We already have squid.conf options to drop headers. Folks that want to
>> focus on saving bandwidth may use them. We can publish the corresponding
>> configuration excerpts on the wiki.
>>
>> If those options are not enough, let's add more. If those options slow
>> Squid down too much, let's discuss optimizations (keeping in mind that
>> much better optimizations can probably be obtained by preserving header
>> blobs during forwarding).
>>
>> However, please do not hard-code policing of messages Squid can grok,
>> especially in the parser.


> See my post in reply to Eliezer.

Your reply to Eliezer does not address my concern: Why should we avoid
the existing configuration interface that already allows folks to drop
what they want to drop?


> But the connection: and content-length header mangling, and
> some of the other security bypasses have deeper implications and special
> processing may be needed to cleanup properly. 

Special identification code may indeed be needed for mangling cases, but
it can be integrated with the existing configuration nicely. If you want
to add code that would identify "mangled" Connection and Content-Length
headers, that code can be engaged using something like a
Mangled-Header-Name or %Mangled-Header-Names request_header_access
parameter.

We can then discuss whether it is a good idea to use that configuration
by default (which is a separate decision IMO, see further below).
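
As a rough idea of what such identification code might look like, here
is a small Python sketch. It assumes, purely for illustration, that a
"mangled" name is a letter-for-letter scramble of a known header name.
The function name is hypothetical, and variants such as x-cnection
would need a different rule.

```python
# Standard names whose scrambled forms were reported in this thread.
KNOWN_NAMES = ("connection", "content-length")


def unmangle(name: str):
    """Return the standard header a scrambled name seems to derive from.

    Returns None for correctly spelled names and for anything that is
    not a pure rearrangement of a known name.
    """
    lowered = name.lower()
    for known in KNOWN_NAMES:
        if lowered != known and sorted(lowered) == sorted(known):
            return known
    return None
```

A check like this could then back an ACL-style match, leaving the
decision of what to do with the header to squid.conf.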


> ie drop a cneonction:
> header and also drop any it lists just to be safe,

This does not sound like a bandwidth savings feature, but Squid can
indeed be taught to drop headers listed by the mangled Connection
header. This behavior can be triggered automatically if Squid is
dropping a mangled Connection header.
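
A sketch of that side action in Python, assuming the same
letter-scramble test for "mangled" as a working definition. The
function name is made up, and this is not a claim about how Squid
would implement it.

```python
def drop_mangled_connection(headers):
    """Remove scrambled Connection headers and every header they list."""

    def is_mangled_connection(name):
        low = name.lower()
        return low != "connection" and sorted(low) == sorted("connection")

    listed = set()
    for name, value in headers:
        if is_mangled_connection(name):
            # Treat the value as the former Connection token list.
            listed.update(t.strip().lower() for t in value.split(",") if t.strip())
    return [(name, value) for name, value in headers
            if not is_mangled_connection(name) and name.lower() not in listed]
```

Whether this should ever run by default is a separate question, since
honoring a header that other agents ignore changes what different hops
see.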


> or reject requests with cteonnt-length: header in self defense.

Rejecting requests goes even further from the original RFC, but I
believe we have configuration options to do that as well.

Going forward, I suggest the following discussion format:

For each header (or an algorithmically defined group of headers):

a) Why might anybody want to drop this header?

b) How do we want to tell Squid that the header should be dropped?
   Hard-coded decision, existing squid.conf directives, other?

c) What should be the default behavior? Drop or leave "as is"?

d) Any side actions if the header is dropped? For example, should we
also drop the headers the dropped header lists?


HTH,

Alex.



Re: [RFC] bandwidth savings via header eliding

2014-07-19 Thread Alex Rousskov
On 07/18/2014 11:33 PM, Amos Jeffries wrote:
> On 19/07/2014 2:07 a.m., Eliezer Croitoru wrote:
>> This caught my eye, but I am not reading all IETF httpbis mails and I
>> would like to get a reference for this thread please?

> There are two types of removable headers:
>  a) headers which exist purely to bypass security
>  b) headers which exist due to intermediaries breaking them
> 
> The post describing why the (b) group occur is here:
>   http://lists.w3.org/Archives/Public/ietf-http-wg/2014JulSep/0132.html

The above email is talking about a "nnCoection: close" header which
appears to be a result of a bug in some 15-year old software.
Identifying that rare header would be overall harmful -- Squid would
spend more resources on detecting that header presence than it will save
by removing that header when it is found.


> One of the posts which is making me think we could benefit from doing
> something is:
>   http://lists.w3.org/Archives/Public/ietf-http-wg/2014JulSep/1220.html

> This lists the existing headers found in the data sets being analysed by
> IETF as representative of HTTP web traffic.

> What I can see in that listing is the following headers (by type above).
> 
> (A) group:
> 
>  x-powered-by / x-aspnet-version / x-aspnetmvc-version / x-pb-mii -
> exist to bypass server security measures applied on the Server: header.

Sounds like those headers exist to implement some above-HTTP
functionality deemed useful to those who send and/or receive them. What
will Squid break by removing those valid HTTP headers? Why is breaking
that functionality a universally good thing that justifies making it the
default behavior in an HTTP proxy?


>  x-served-by - same as X-powered-by but also crossing over to contain
> X-forwarded-for: and Forwarded: header contents (but without the
> security protections applied for them).
> 
>  x-host / x-forwarded-host - exist to bypass Browser same-origin
> security measures.
> 
>  x-li-uuid - tracking cookie created to bypass Cookie header security
> and legislative restrictions.
> 
>  x-fs-uuid - header for distributing the UUID of the server hard drive
> out to the public network (seriously, what could go wrong with that huh?)
> 
>  x-radid - seems to be another disk drive tracking ID method.

Same questions apply here. Please correct me if I am wrong, but it
sounds like you are dividing HTTP-compliant agents into two categories:
Those that use HTTP the way you want HTTP to be used and all others. The
division appears to be based not on some HTTP MUSTs, but your view of
which "security" model must be defended.

IMO, Squid should strive to support all HTTP-compliant agents by
default. We should not be the internet police because policing traffic
requires making judgments of who is the "bad" guy, which is outside of
software developers' competence. Folks that want to enforce a particular
security model may propose optional features and configuration excerpts
that do so, of course.

There are some gray areas like defense against request smuggling, but
even there extreme care should be taken to avoid harming valid HTTP
traffic. It is certainly not the area of "delete all bad headers in the
parser" solutions.


> The (b) group worries me for the reasons given below:
> 
>  nncoection / cneonction / x-cnection - reason described in the above
> email. I am a little bit worried that in HTTP/1.1 these may have
> actually contained lists of headers which were to be dropped by the
> earlier intermediary. By obscuring the "Connection:" name we are
> potentially transmitting headers like Upgrade: or with private details
> that should be elided.

I do not see why honoring _and_ then dropping what we think is a former
Connection header helps more than it hurts (by default). In fact, that
sounds like a useful smuggling attack vector to me -- "we know Squid
will drop these headers but others will pass them on, so let's use that
for our evil needs".


>  ntcoent-length / cteonnt-length - Given the reason behind 16-bit rotate
> on header name any of the mandatory HTTP/1.1->1.0 and connection:close
> addition required to make this safe will alter the checksum. So will
> content adaptation if that was the point.

I do not understand how header changes affect content checksums. Those
checksums do not include headers.


> I am left with assuming that this is done to smuggle messages in a
> pipeline through the receiving server as a single request/reply.

Your assumption seems to contradict what we know for a fact is going on
in many (probably most!) cases of such header name adaptations --
converting standard header names into extension header names to avoid
buffer copies.



> There are also a bunch of other headers which can best be called
> "garbage". Relatively harmless though.
> 
> Old HTTP features and mechanisms which are now not supposed to be sent:
> 
>  pragma:close - dead HTTP/1.0 feature. Not to be emitted by HTTP/1.1
> software.
>  p3p- dead standard, removed from service due to privacy violations.

Re: [RFC] bandwidth savings via header eliding

2014-07-18 Thread Amos Jeffries
On 19/07/2014 2:55 a.m., Alex Rousskov wrote:
> On 07/18/2014 01:32 AM, Amos Jeffries wrote:
>> Some of the statistics being brought up in the IETF HTTP/2 discussions
>> are highlighting certain garbage headers which are unfortunately quite
>> common.
> 
> I join Eliezer in begging for pointers to relevant posts or pages.
> 
> 
>> I have wondered about creating a registry of known garbage and simply
>> dropping those headers on arrival in the parser. This would be in
>> addition to the header registry lookup and masking process we have for
>> hop-by-hop headers.
>>
>> Any other thoughts on this?
> 
> We already have squid.conf options to drop headers. Folks that want to
> focus on saving bandwidth may use them. We can publish the corresponding
> configuration excerpts on the wiki.
> 
> If those options are not enough, let's add more. If those options slow
> Squid down too much, let's discuss optimizations (keeping in mind that
> much better optimizations can probably be obtained by preserving header
> blobs during forwarding).
> 
> However, please do not hard-code policing of messages Squid can grok,
> especially in the parser.


See my post in reply to Eliezer. The general garbage ones we could leave
to the admin. But the connection: and content-length header mangling, and
some of the other security bypasses, have deeper implications, and special
processing may be needed to clean up properly, i.e. drop a cneonction:
header and also drop any headers it lists just to be safe, or reject
requests with a cteonnt-length: header in self defense.

Amos


Re: [RFC] bandwidth savings via header eliding

2014-07-18 Thread Amos Jeffries
On 19/07/2014 2:07 a.m., Eliezer Croitoru wrote:
> This caught my eye, but I am not reading all IETF httpbis mails and I
> would like to get a reference for this thread please?
> 

There are two types of removable headers:
 a) headers which exist purely to bypass security
 b) headers which exist due to intermediaries breaking them

The post describing why the (b) group occur is here:
  http://lists.w3.org/Archives/Public/ietf-http-wg/2014JulSep/0132.html

One of the posts which is making me think we could benefit from doing
something is:
  http://lists.w3.org/Archives/Public/ietf-http-wg/2014JulSep/1220.html

This lists the existing headers found in the data sets being analysed by
the IETF as representative of HTTP web traffic, i.e. the data against
which HTTP/2 compression and improvements are measured.

What I can see in that listing is the following headers (by type above).

(A) group:

 x-powered-by / x-aspnet-version / x-aspnetmvc-version / x-pb-mii -
exist to bypass server security measures applied on the Server: header.

 x-served-by - same as X-powered-by but also crossing over to contain
X-forwarded-for: and Forwarded: header contents (but without the
security protections applied for them).

 x-host / x-forwarded-host - exist to bypass Browser same-origin
security measures.

 x-li-uuid - tracking cookie created to bypass Cookie header security
and legislative restrictions.

 x-fs-uuid - header for distributing the UUID of the server hard drive
out to the public network (seriously, what could go wrong with that huh?)

 x-radid - seems to be another disk drive tracking ID method.


The (b) group worries me for the reasons given below:

 nncoection / cneonction / x-cnection - reason described in the above
email. I am a little bit worried that in HTTP/1.1 these may have
actually contained lists of headers which were to be dropped by the
earlier intermediary. By obscuring the "Connection:" name we are
potentially transmitting headers like Upgrade: or with private details
that should be elided.

 ntcoent-length / cteonnt-length - Given the reason behind 16-bit rotate
on header name any of the mandatory HTTP/1.1->1.0 and connection:close
addition required to make this safe will alter the checksum. So will
content adaptation if that was the point.
  I am left with assuming that this is done to smuggle messages in a
pipeline through the receiving server as a single request/reply.



There are also a bunch of other headers which can best be called
"garbage". Relatively harmless though.

Old HTTP features and mechanisms which are now not supposed to be sent:

 pragma:close - dead HTTP/1.0 feature. Not to be emitted by HTTP/1.1
software.
 p3p- dead standard, removed from service due to privacy violations.
 x-pad  - supposedly an HTTPS-only feature for "fixing"
 proxy-connection - dead non-standard. we already drop this one


debug headers that are mostly useless (we could help clean this up by
only enabling our x-cache headers based on a debug config option)

 x-cache / x-cache-lookup / x-cache-action / x-cache-hits / x-cache-age
/ x-fb-debug / x-mii-cache-hit / bk-server


Amos

> Thanks,
> Eliezer
> 
> On 07/18/2014 10:32 AM, Amos Jeffries wrote:
>> Some of the statistics being brought up in the IETF HTTP/2 discussions
>> are highlighting certain garbage headers which are unfortunately quite
>> common.
>>
>> I have wondered about creating a registry of known garbage and simply
>> dropping those headers on arrival in the parser. This would be in
>> addition to the header registry lookup and masking process we have for
>> hop-by-hop headers.
>>
>> Any other thoughts on this?
>>
>> Amos
> 



Re: [RFC] bandwidth savings via header eliding

2014-07-18 Thread Alex Rousskov
On 07/18/2014 01:32 AM, Amos Jeffries wrote:
> Some of the statistics being brought up in the IETF HTTP/2 discussions
> are highlighting certain garbage headers which are unfortunately quite
> common.

I join Eliezer in begging for pointers to relevant posts or pages.


> I have wondered about creating a registry of known garbage and simply
> dropping those headers on arrival in the parser. This would be in
> addition to the header registry lookup and masking process we have for
> hop-by-hop headers.
> 
> Any other thoughts on this?

We already have squid.conf options to drop headers. Folks that want to
focus on saving bandwidth may use them. We can publish the corresponding
configuration excerpts on the wiki.

If those options are not enough, let's add more. If those options slow
Squid down too much, let's discuss optimizations (keeping in mind that
much better optimizations can probably be obtained by preserving header
blobs during forwarding).

However, please do not hard-code policing of messages Squid can grok,
especially in the parser.


Thank you,

Alex.



Re: [RFC] bandwidth savings via header eliding

2014-07-18 Thread Eliezer Croitoru
This caught my eye, but I am not reading all IETF httpbis mails and I
would like to get a reference for this thread please?


Thanks,
Eliezer

On 07/18/2014 10:32 AM, Amos Jeffries wrote:

> Some of the statistics being brought up in the IETF HTTP/2 discussions
> are highlighting certain garbage headers which are unfortunately quite
> common.
>
> I have wondered about creating a registry of known garbage and simply
> dropping those headers on arrival in the parser. This would be in
> addition to the header registry lookup and masking process we have for
> hop-by-hop headers.
>
> Any other thoughts on this?
>
> Amos