Let me preface ALL the remarks below with TWO statements...

1. I haven't done any research on these HTTP based Client/Server compression 
topics in quite some time. It is all, essentially, 'ancient history' for me 
but it still amazes me that some of the issues are, so many years later,
still being 'guessed about' and no one has all the answers.

2. Let's not lose the TOPIC of this thread. The thread is about whether
or not it's time to just turn mod_deflate ON by default in the 'safe'
httpd.conf that ships with Apache. Regardless of disagreement on some
of the internals and the remaining 'questions' I think it's clear to 
some now that this is NOT just an 'easy decision'. It's complicated.
It WILL cause 'problems' for some installations and some environments.

> Bryan McQuade wrote...
>
>> Kevin Kiley wrote...
>> Don't forget the ongoing issue that if you ONLY vary on 'Accept-Encoding'
>> then almost ALL browsers will then refuse to cache a response entity LOCALLY
>> and the pain factor moves directly to the Proxy/Content Server(s).
>
> I don't think this is true for browsers in use today. 

Well, it's certainly true of the MSIE 6 I have 'in use today' on almost
all of my Windows XP Virtual Machines that I use for testing.

Also 'I don't think this is true' is certainly not 'I'm SURE this is not true'.

> Firefox will certainly cache responses with Vary: Accept-Encoding. 

I haven't done any testing with Firefox.

Firefox wasn't even around when this first became an issue years ago.

I'll take your word for it unless/until you can provide some Firefox
specific test results page(s) that prove this to be true.

The Mozilla/Firefox family has ALWAYS had a 'different' approach to
how the client-end decompression gets done. That browser lineage chose
to always use the local 'cache' as the place where the decompression
takes place. That's why, when you use those browsers and you receive
a compressed entity, you always get TWO cache files. One is simply the
browser opening up a cache file to store the COMPRESSED version of the
entity and the other is the DECOMPRESSED version. This is true even if 
the response is labeled 'Cache-control: private'. It will STILL 'cache' the
response and ignore all 'Cache-Control:' directives but that's another
'bug' story altogether. They are simply doing all the decompression 'on disk' 
and using plain old file-based GUNZIP to get the job done so there HAS to be
a 'cached copy' of the response regardless of any 'Cache-Control:' directives. 
It will also keep BOTH copies of the file around. The MSIE browser line
will also use a cache-file ( sometimes ) for the decompression but, unlike the
Mozilla lineage, MSIE will DELETE the initial compressed copy to avoid 
confusion.

There used to be this weird-ass bug with Mozilla that would only
show up if you tried to PRINT a decompressed page. The browser would
forget that it had TWO disk files representing the compressed/uncompressed
response and it would accidentally try to PRINT the COMPRESSED version.
I certainly hope the Firefox branch of this source code line worked 
that little bug out.

> Eric Lawrence of
> the Internet Explorer team has a nice blog post that explains that in
> IE6 and later, Vary: Accept-Encoding are cached:
> http://blogs.msdn.com/b/ieinternals/archive/2009/06/17/vary-header-prevents-caching-in-ie.aspx.

The 'EricLaw' MSDN Blog link you mention only says this about MSIE 6...

[snip]

Internet Explorer 6

Internet Explorer 6 will treat a response with a Vary header as completely
uncacheable, unless the Vary header contains only the token User-Agent or
Accept-Encoding. Hence, a subsequent request will be made unconditionally,
resulting in a full re-delivery of the unchanged response.

This results in a significant performance problem when Internet Explorer 6 
encounters Vary headers.

[/snip]

This does NOT match my own research with MSIE 6 so the guy must be talking 
about some very specific BUILD version or 'hotpatched' version of MSIE 6.

In case you missed it... here is the link to one of the discussions
about this on the Apache mod_gzip forum which contains complete
test results obtained using a kernel debugger with MSIE 4, 5 and 6...

http://marc.info/?l=apache-modgzip&m=103958533520502&w=2

Eric also fails to mention that if you include MULTIPLE
Vary: headers and/or multiple conditionals on the
same 'Vary:' line ( as RFC 2616 says you are supposed to be able to do ) 
then MSIE 6 stops caching and treats those inbounds as the infamous 'Vary: *'
( Vary: STAR ). I believe that last part is STILL TRUE even to this day which
means it is STILL 'Non-RFC compliant'. This 'other issue' was also covered in
my own MSIE 6 testing at the link above.

> Other variants of Vary do prevent caching in IE but Vary:
> Accept-Encoding is safe.

According to EricLaw, yes... but see links and test results above.

That is NOT what my own MSIE 6 testing showed.

My testing on MSIE 6 only showed that ANY presence of ANY 'Vary:'
header OTHER than 'User-Agent' would cause that browser to treat 
the inbound as if it had 'Vary: *' ( Vary: STAR ) and would then REFUSE 
to cache the response locally. It also even mattered that the 'User-Agent'
designation be formatted 'perfectly' ( according to them ) or it 
wouldn't even cache that 'Vary: User-Agent' response.
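
To make the disagreement concrete, here is a rough Python sketch of the
two competing rules: the one in the quoted EricLaw excerpt and the one
the MSIE 6 testing described above pointed to. The function names are
made up for illustration and nothing here is taken from MSIE itself. The
'perfectly formatted User-Agent' quirk is NOT modelled, since the exact
formatting requirement isn't spelled out in this message.

```python
def vary_tokens(vary_headers):
    """Collect lower-cased field names from one or more Vary header values.

    RFC 2616 allows several Vary headers and/or a comma-separated list
    on one line, so both forms are flattened into a single token list.
    """
    tokens = []
    for value in vary_headers:
        for part in value.split(","):
            part = part.strip().lower()
            if part:
                tokens.append(part)
    return tokens


def cacheable_per_blog(vary_headers):
    """Rule as quoted from the blog post (hypothetical model):
    cacheable only if Vary holds just User-Agent or just Accept-Encoding."""
    tokens = vary_tokens(vary_headers)
    if not tokens:
        return True  # no Vary at all, so Vary is not what blocks caching
    return len(tokens) == 1 and tokens[0] in ("user-agent", "accept-encoding")


def cacheable_per_list_tests(vary_headers):
    """Rule as reported from the mod_gzip list testing (hypothetical model):
    anything other than a lone User-Agent is treated like 'Vary: *'."""
    tokens = vary_tokens(vary_headers)
    if not tokens:
        return True
    return tokens == ["user-agent"]
```

The two functions only disagree on a lone 'Vary: Accept-Encoding', which
is exactly the case this thread cares about. Both refuse to cache once
more than one field name shows up, whether on one line or on several.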

> What browsers are you thinking of that will refuse to cache a response
> with Vary: Accept-Encoding?

See above. My own testing focused on MSIE 4, 5, 6 ( MSIE 7 wasn't out at
that time ) but OTHERS on the mod_gzip forum thread were the ones reporting
similar failures with the OTHER browsers like Mozilla and Opera and Safari.
That's what that entire thread was focused on. WHICH browser will 'do the
right thing' and which ones won't.

>> The OTHER 'ongoing issue' regarding compression is that, to this day,
>> it still ONLY works for a limited set of MIME types. The
>> 'Accept-Encoding: gzip,deflate' header coming from ALL major browser is
>> still mostly a LIE. It would seem to indicate that the MIME type doesn't
>> matter and it will 'decode' for ANY MIME type but nothing could be
>> further from the truth. There is no browser on the planet that will
>> 'Accept-Encoding' for ANY/ALL mime type(s).
>
> I don't buy this argument.

I'm not sure it matters whether you 'buy the argument' or not.
I wasn't really trying to 'sell' anything at all and I wasn't 
presenting it as an 'argument'. I was trying to report what I believe 
is an ongoing concern for ANYONE who wants to turn on Web Site 
compression. Hopefully this can all one day come down to just pure
empirical evidence and not remain in the 'I buy/don't buy the argument'
realm. 

I don't have all the 'answers' and neither do you. 
The question at hand is... does anyone?

> Browsers support content decoding for all the common mime types that 
> benefit from content decoding. 

That's a pretty sweeping statement. So broad, in fact, that I am afraid
I need to ask YOU to supply some proof that what you are saying there is TRUE.

Things are probably a little better now than they were a few years ago
but I think you will find your statement is still not 100 percent correct.

Sometimes it actually depends on WHEN the content loading takes place.
Example: Used to be with Mozilla/MSIE that you MIGHT get good decoding on
some application/x-javascript stuff if the 'includes' were all in the HEAD
section of the response document but for any <link/include whatever.js>
statements down in the BODY the decode would NOT take place since the
document rendering had already begun. I hope that has all changed but I,
myself, don't have any recent test results for that. Ditto for CSS.
Sometimes it simply mattered whether the 'includes' were in the HEAD
section ( or not ). That all has to do with WHEN the browser is 'ready'
to 'pass off' to a content decoder and when it is NOT.

> It is true that IE6 has bugs but they have been hotfixed. 
> See http://support.microsoft.com/kb/823386 for an example. This was
> hotfixed in 2003. 

The 'hotfix' you are referring to has nothing to do with adherence to
RFC standards. It was a 'nuts and bolts' bugfix that had to do with 
an ongoing URLMON buffering issue. It MAY also have the 'fix' for 
'Vary:' you described earlier but I didn't see any mention of it.

> I'm not aware of any browser compatibility issues
> with text/* mime types for browsers released after IE6. 

I, personally, have done no real-world testing with MSIE 7.
I don't know how much they may have fixed or not fixed.

Anyone ELSE know 'for sure' about that browser?

> No doubt there are some IE6 users without this hotfix that are impacted 
> by this issue. 

See above. The hotfix you are referring to is certainly needed since it
fixes the URLMON low-level buffering issue when receiving compressed data... 
but it is unclear whether this hotfix adds better 'Vary:' support to MSIE 6.

> If there are specific concerns about IE6 users, we can
> blacklist that user agent from getting gzipped responses in the
> default configuration. 

That's still a WHOLE BUNCH of folks. More than you might think.
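
For what it's worth, the kind of blacklist being proposed could be
sketched like this in Python. This is only an illustration of the
decision logic and NOT what mod_deflate actually does: the function
name, the 'MSIE 6.' match and the text/* prefix rule are all assumptions
made for the example, and q-values in Accept-Encoding are ignored.

```python
import re

# Assumption: an MSIE 6 client is recognised by "MSIE 6." in its User-Agent.
MSIE6_PATTERN = re.compile(r"\bMSIE 6\.")

# Assumption: only text/* types are considered safe to compress.
COMPRESSIBLE_PREFIXES = ("text/",)


def should_compress(user_agent, accept_encoding, content_type):
    """Decide whether to gzip a response (hypothetical policy sketch)."""
    # Blacklisted browser: never compress, whatever it claims to accept.
    if MSIE6_PATTERN.search(user_agent or ""):
        return False

    # The client must list gzip; q-values are dropped, so "gzip;q=0"
    # would wrongly count as acceptable in this simplified version.
    encodings = [
        item.split(";")[0].strip().lower()
        for item in (accept_encoding or "").split(",")
    ]
    if "gzip" not in encodings:
        return False

    # Compress only the MIME types on the allow-list.
    mime = (content_type or "").split(";")[0].strip().lower()
    return mime.startswith(COMPRESSIBLE_PREFIXES)
```

Even in this toy form the problem is visible: every MSIE 6 user falls
into the first branch and gets the full uncompressed response, and the
server is now making its decision on User-Agent as well as on
Accept-Encoding.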

> Let's not disable gzip compression for the other 90+% of users 
> that have solid support just because one old browser has spotty 
> but hotfixed support for it.

Again... I have no reason to doubt your numbers... but can
you show some PROOF that '90+' percent of all Browsers in
use today have FULL support for content-decoding under ALL 
circumstances?

I would think even the Apache boys themselves would at least be open
to making the changes if someone could just supply some REAL PROOF
and some REAL NUMBERS instead of all these GUESSES.

Yours
Kevin Kiley



