On 17/04/2013 9:58 p.m., anita wrote:
> Hi Amos,
> I realise this could be a development related post. I will repost it there.
> Sorry for the inconvenience.
>
> I am raising this request internally in the Squid code to fetch some URLs that
> are present in the already replied object.
> Say I am using a client to request fetchcompress.html (without any encoding
> set). Squid fetches this fetchcompress.html from the origin
> server (Apache) and returns it to the client.
> At the same time, it parses fetchcompress.html to see if it has any
> prefetchable URLs.
> In my case, fetchcompress.html has a prefetchable link, compress.html.
> To fetch this, I set up a fake request header with "Accept-Encoding: gzip" in
> it. This is done internally by the Squid code itself. I believe this is
> done successfully, as I can see it in the tcpdump (refer to the "Request sent to
> Apache (tcpdump)" section in my previous post).
> When I retrieve this object using a StoreClientCopy(), it gives me an empty
> object, i.e. the object length was 0.
> a) Now did this happen because I simply retrieved the object based on the
> URL alone?
> b) Why is the Content-Encoding tag absent from the reply header?
Aha. The answer then is *because* you fetched the object from the cache
by URL alone. This URL points at a variant resource, so the Squid cache
entry for the URL alone points at the internal vary marker object. The
real object is stored at a hash location built from the URL plus the
Accept-Encoding header text "gzip".
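To make the two-level layout concrete, here is a minimal Python sketch. It is a toy model only: store_key(), the dict layout and the example URL are made up for illustration and are not Squid's actual key function or storage format. It just shows why a lookup by URL alone lands on the marker while the gzipped body sits under a different key.

```python
import hashlib

def store_key(url, accept_encoding=None):
    """Toy cache key: hash of the URL, optionally extended with the header text."""
    text = url if accept_encoding is None else url + accept_encoding
    return hashlib.md5(text.encode("utf-8")).hexdigest()

URL = "http://example.com/compress.html"  # hypothetical URL

# State of the toy store after one fetch made with "Accept-Encoding: gzip".
store = {
    store_key(URL): {"kind": "vary-marker", "vary": "Accept-Encoding", "body": b""},
    store_key(URL, "gzip"): {"kind": "object", "content_encoding": "gzip",
                             "body": b"<gzipped bytes>"},
}

by_url_only = store[store_key(URL)]         # what a URL-only lookup finds
by_variant = store[store_key(URL, "gzip")]  # what URL + header text finds

print(by_url_only["kind"], len(by_url_only["body"]))
print(by_variant["kind"], by_variant["content_encoding"])
```

In this model the URL-only entry has no body and no Content-Encoding at all, which matches the symptoms you described.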
You need to treat these vary marker objects as a tool to determine that:
a) the prefetcher needs to make several lookups for this URL, and
b) the marker object's Vary: header decides which permutations of
which headers the prefetcher should use on its follow-up fetches.
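The two points above can be sketched as a two-step lookup. Again this is an illustrative Python model, not Squid code: lookup(), variant_key() and the tuple-keyed dict are hypothetical names, and the real store API looks nothing like a dict. The idea is only that the marker's Vary: value tells the prefetcher which request headers go into the second lookup.

```python
def variant_key(url, vary_names, request_headers):
    """Build the second-level key from the request's values for each Vary header."""
    values = tuple(request_headers.get(name, "") for name in vary_names)
    return (url,) + values

def lookup(store, url, request_headers):
    """Two-step lookup: the URL first, then the variant the marker asks for."""
    entry = store.get((url,))
    if entry is None:
        return "MISS", None
    if entry.get("marker"):
        # The marker only says which request headers select the variant.
        vary_names = [name.strip() for name in entry["vary"].split(",")]
        entry = store.get(variant_key(url, vary_names, request_headers))
        if entry is None:
            return "MISS", None
    return "HIT", entry

URL = "http://example.com/compress.html"  # hypothetical URL
store = {
    (URL,): {"marker": True, "vary": "Accept-Encoding"},
    (URL, "gzip"): {"body": b"<gzipped bytes>"},
}

hit = lookup(store, URL, {"Accept-Encoding": "gzip"})
miss = lookup(store, URL, {"Accept-Encoding": "deflate"})
print(hit[0], miss[0])
```

A prefetcher built this way would never hand the marker itself back to its caller; it either finds the matching variant or reports a MISS and goes to the server.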
Here is a little sequence of client transactions and what to expect in
the store contents state for your edification:
For simplicity the start state is a new URL with no stored contents.
1) Your prefetch pulls using "Accept-Encoding:gzip".
   Store determines MISS on the URL.
   After the server responds, the Squid store contains:
     + HASH(URL) --> vary marker
     + HASH(URL+"gzip") --> server gzip encoded response

2) Client fetches URL using "Accept-Encoding:gzip,deflate".
   Store loads the vary marker, determines MISS on the URL+"gzip,deflate".
   After the server responds, the Squid store contains:
     + HASH(URL) --> vary marker
     + HASH(URL+"gzip") --> server gzip encoded response
     + HASH(URL+"gzip,deflate") --> server gzip encoded response

3) Client fetches URL using "Accept-Encoding:deflate".
   Store loads the vary marker, determines MISS on the URL+"deflate".
   After the server responds, the Squid store contains:
     + HASH(URL) --> vary marker
     + HASH(URL+"gzip") --> server gzip encoded response
     + HASH(URL+"gzip,deflate") --> server gzip encoded response
     + HASH(URL+"deflate") --> server deflate encoded response

4) Client fetches URL using "Accept-Encoding:sdch".
   Store loads the vary marker, determines MISS on the URL+"sdch".
   After the server responds, the Squid store contains:
     + HASH(URL) --> vary marker
     + HASH(URL+"gzip") --> server gzip encoded response
     + HASH(URL+"gzip,deflate") --> server gzip encoded response
     + HASH(URL+"deflate") --> server deflate encoded response
     + HASH(URL+"sdch") --> server sdch encoded response
Repeat for all possible combinations of all encoding types (including
whitespace padding permutations), with an occasional HIT when a client
repeats an Accept-Encoding header that has already been seen.
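If it helps, the sequence can be replayed with a small Python simulation. ToyStore is a made-up class that keys variants on the raw Accept-Encoding text, as described above, and pretends every MISS is answered by a server sending "Vary: Accept-Encoding". It is a sketch of the behaviour, not of Squid's implementation.

```python
class ToyStore:
    """Toy cache keyed on the URL plus the raw Accept-Encoding text."""

    def __init__(self):
        self.entries = {}

    def fetch(self, url, accept_encoding):
        key = (url, accept_encoding)
        if key in self.entries:
            return "HIT"
        # MISS: pretend the origin server replied with "Vary: Accept-Encoding".
        self.entries[(url,)] = "vary marker"
        self.entries[key] = "response for " + repr(accept_encoding)
        return "MISS"

URL = "http://example.com/compress.html"  # hypothetical URL
store = ToyStore()

# Steps 1-4 from the sequence above.
first_four = [store.fetch(URL, ae)
              for ae in ["gzip", "gzip,deflate", "deflate", "sdch"]]
entries_after_four = len(store.entries)

# A whitespace-padded header is a new variant; a repeated header is a HIT.
padded = store.fetch(URL, "gzip, deflate")
repeated = store.fetch(URL, "gzip")

print(first_four, entries_after_four, padded, repeated)
```

Four requests leave five entries behind (the marker plus four variants), and only the exact repeat of an earlier header text scores a HIT.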
This is why prefetching is not very popular. You have to prefetch at
least 4 variants of the encoding header just to cover the most popular
browsers - I will leave it to you to figure out what those are.
HTH
Amos