Package: wget
Version: 1.14-5
Severity: minor

I observed that wget can save content in compressed form when
downloading through a squid cache. Both tools operate within the bounds
of RFC2616, but their combination produces the wrong result. Consider
the following interaction:

An HTTP client that supports Accept-Encoding and Content-Encoding
downloads an object via a squid3 proxy. An encoding other than the
identity encoding is chosen, and squid3 considers the response
cacheable. At a later time, wget is instructed to retrieve the same
object via the same proxy. Unlike the other client, wget does not send
an Accept-Encoding header. The squid3 proxy relies on an assumption
explicitly permitted by RFC2616 14.3:

| If no Accept-Encoding field is present in a request, the server MAY
| assume that the client will accept any content coding.

It thus sends back the cached object including the Content-Encoding
header, but wget does not handle the header and stores the object in
compressed form.
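
To make the sequence concrete, here is a rough toy model of it in
Python. It is not squid3 or wget code: the names CACHED, proxy_respond
and naive_client_save are made up for illustration, and gzip is only
assumed as the non-identity encoding.

```python
import gzip

ORIGINAL = b"hello from the origin server\n" * 4

# Hypothetical cache entry left behind by the first, gzip-capable client.
CACHED = {
    "headers": {"Content-Encoding": "gzip"},
    "body": gzip.compress(ORIGINAL),
}


def proxy_respond(request_headers):
    """Toy proxy: with no Accept-Encoding, assume any coding is accepted."""
    if "Accept-Encoding" not in request_headers:
        # RFC2616 14.3 MAY clause: serve the cached entity unchanged.
        return CACHED["headers"], CACHED["body"]
    # Requests that do carry the header are out of scope for this sketch.
    raise NotImplementedError("only the header-less case is modelled")


def naive_client_save(headers, body):
    """Toy client: ignores Content-Encoding and keeps the bytes verbatim."""
    return body


# The second client sends no Accept-Encoding header at all.
headers, body = proxy_respond({})
saved = naive_client_save(headers, body)

print(saved == ORIGINAL)                   # False: the saved copy is gzip data
print(gzip.decompress(saved) == ORIGINAL)  # True: it is intact, just encoded
```

The saved bytes are not corrupt; they are simply still encoded, which
is what the Content-Encoding header in the response was announcing.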

Note that RFC2616 goes on to say:

| In this case, if "identity" is one of the available content-codings,
| then the server SHOULD use the "identity" content-coding, unless it has
| additional information that a different content-coding is meaningful to
| the client.

So arguably squid3 is violating a SHOULD clause and should be fixed.

A simple way to resolve the situation is to make wget send an
Accept-Encoding header by default, either empty or with the value
"identity".
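
Why either value would work can be seen from a simplified reading of
the RFC2616 14.3 rules, sketched below in Python. The function name
coding_acceptable is hypothetical, the parsing is deliberately minimal
(only "q=" parameters are understood), and this is not how squid3 or
wget actually decide anything.

```python
def coding_acceptable(accept_encoding, coding):
    """Simplified RFC2616 14.3 check.

    accept_encoding is the header value, or None if the header is absent.
    """
    if accept_encoding is None:
        # Header absent: the server MAY assume any coding is accepted.
        return True

    prefs = {}
    for item in accept_encoding.split(","):
        item = item.strip()
        if not item:
            continue
        name, _, params = item.partition(";")
        params = params.strip()
        qvalue = float(params[2:]) if params.startswith("q=") else 1.0
        prefs[name.strip().lower()] = qvalue

    coding = coding.lower()
    if coding in prefs:
        return prefs[coding] > 0
    if "*" in prefs:
        return prefs["*"] > 0
    # Unlisted codings are refused, except "identity", which stays
    # acceptable unless it was explicitly excluded above.
    return coding == "identity"


print(coding_acceptable(None, "gzip"))            # True  (current behaviour)
print(coding_acceptable("", "gzip"))              # False (empty header)
print(coding_acceptable("identity", "gzip"))      # False
print(coding_acceptable("identity", "identity"))  # True
```

With either proposed header value, a cached gzip entity is no longer
acceptable for the request, so a conforming proxy would have to fall
back to the identity encoding. Until the default changes, the header
can presumably be supplied by hand through wget's --header option.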

Helmut

