On Thu, Apr 4, 2019 at 2:23 PM Ryan Schmidt <[email protected]> wrote:
>> On Apr 4, 2019, at 11:45, Dave Allured wrote:
>>
>> That *is* relevant. My Mac is behind an institutional firewall, so this
>> might be aggravating the problem. Also, I am not able to duplicate the
>> headers that Ryan showed using curl -I, not with the facebook.net URL. I
>> will ask our network admins about this.
>
> What headers are you seeing then? Sounds like your corporate
> firewall/proxy may be rewriting headers and content. This of course is one
> of the reasons why so many sites are switching to https -- to prevent such
> often-broken rewriting from taking place.

I learned that we have a corporate gateway in addition to a local firewall.
It looks like this problem boils down to the combination of a
misconfiguration on mirror.facebook.net and unexpected content processing,
with possible misconfiguration, on our gateway. This explains why all
client software, users, and OS types inside our local network get
unavoidable, unwanted decompression, yet nobody outside sees the problem.
My network expert agrees that https rather than http would likely prevent
this.

Here are the requested headers as seen by curl on my local Mac. These have
obviously been altered somewhere in transit, compared to what Ryan reported
earlier:

    curl -I http://mirror.facebook.net/gnu/groff/groff-1.22.4.tar.gz
    HTTP/1.1 200 OK
    Via: 1.1 137.75.75.19 (McAfee Web Gateway 7.7.2.13.0.25943)
    Date: Thu, 04 Apr 2019 22:00:32 GMT
    Server: Apache
    Connection: Keep-Alive
    Content-Type: application/x-gzip
    Accept-Ranges: bytes
    Last-Modified: Sun, 23 Dec 2018 15:06:58 GMT

> I still think there is a misconfiguration of the facebook.net mirror. It is
> sending the "Content-Encoding: x-gzip" header. That means the server is
> claiming it has gzipped the content prior to sending it and the client
> should un-gzip the content upon receiving it. So one of two things happened:

Ryan, I think you are right on about the invalid Content-Encoding header.
This was a key observation that helped our diagnosis.
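For anyone following along, here is a minimal Python sketch of what that header implies. The bytes are stand-ins, not the real tarball; the point is that any hop which honors "Content-Encoding: x-gzip" strips the gzip layer before the file ever reaches disk, which matches what we see inside our network.

```python
import gzip

tar_bytes = b"stand-in for the uncompressed tar archive"
payload = gzip.compress(tar_bytes)         # groff-1.22.4.tar.gz as stored

headers = {"Content-Type": "application/x-gzip",
           "Content-Encoding": "x-gzip"}   # the misconfigured header

body = payload
if headers.get("Content-Encoding") in ("gzip", "x-gzip"):
    body = gzip.decompress(body)           # a compliant decoder strips the layer

# The file is saved as groff-1.22.4.tar.gz, but it is no longer gzipped.
print(body == tar_bytes)   # True: the compression layer is gone
```

So a gateway behaving "correctly" per the header still produces exactly the corrupted download we observe.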
> either 1. Facebook gzipped a gzip file, sent it, and the client should
> remove the outer gzipping to leave you with a gzip file. If this is
> what's happening, it's stupid because there's no reason to gzip an
> already-compressed file; Facebook should stop doing that. Perhaps your
> firewall/proxy is misinterpreting this situation and it is removing all
> layers of gzip compression, not just the first one.
>
> or 2. Facebook is transferring the original gzip file to you but claiming
> that it compressed it and that the client should ungzip it. If this is
> what's happening, it's lying, and I would expect any client to
> (undesirably, in this case) decompress it before saving it. I can confirm
> that curl does decompress it before saving it if the --compressed flag is
> used; I guess curl doesn't react to the Content-Encoding header unless it
> requested a compressed response in the first place. Since your client (not
> your firewall/proxy) made the original request, maybe the firewall/proxy
> doesn't know (or doesn't keep track of) whether the request was for a
> compressed or uncompressed resource, and it just looks at the
> Content-Encoding header to decide, which seems like a reasonable decision.
> It points out an additional misconfiguration of the facebook.net server:
> they're sending (or claiming to send) a compressed response even when the
> client did not request it; Facebook should stop doing that and should send
> what the client asked it to send.

We can access tar.gz files on other mirror sites without problem. This
strengthens the case for misconfiguration on mirror.facebook.net. But I
still wonder whether there is a possible fix for our local gateway
software. I will follow up with both our corporate network admins and
facebook support.

Here is one more curiosity: it is only *.gz files that are mishandled.
Other compressed types such as *.xz and *.rpm are downloaded correctly.
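Ryan's two scenarios can be sketched the same way, again with stand-in bytes rather than the real archive. In scenario 1 the server double-gzips: a correct client strips only the outer transfer-encoding layer and still ends up with a valid .tar.gz, while a proxy that keeps decoding for as long as the payload looks like gzip leaves a bare tar, which is the failure mode we see.

```python
import gzip

tar_bytes = b"stand-in for the uncompressed tar archive"
tar_gz = gzip.compress(tar_bytes)      # the .tar.gz as stored on the mirror
wire = gzip.compress(tar_gz)           # scenario 1: server gzips it again

# Correct behavior: undo only the transfer-encoding layer.
received = gzip.decompress(wire)
print(received == tar_gz)              # True: the .tar.gz survives intact

# Broken proxy: keep stripping while the payload still carries gzip magic bytes.
data = gzip.decompress(wire)
while data[:2] == b"\x1f\x8b":
    data = gzip.decompress(data)
print(data == tar_bytes)               # True: every compression layer is gone
```

This also suggests why only *.gz files are affected: *.xz and *.rpm payloads never look like gzip, so an over-eager gzip decoder leaves them alone.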
