Hi,

The manual says

  "If the local file does not exist, or the sizes of the files do not
  match, Wget will download the remote file no matter what the
  time-stamps say."

In two cases I'm not seeing this:

 1) With if-modified-since I don't believe the content-length is checked
 2) Without if-modified-since, if the remote end returns a 416

Here's a quick example

 $ ./wget http://download.cirros-cloud.net/0.3.4/cirros-0.3.4-x86_64-uec.tar.gz
 $ truncate -s 10M cirros-0.3.4-x86_64-uec.tar.gz  # modify the file size

So firstly, when using current git, we see the "If-Modified-Since"
request sent, but I guess the server does not look at "Range" because
it just returns 304, despite us asking for bytes the file doesn't
have.  wget doesn't notice that the local file is a different size.

---
$ ./wget --debug --timestamping -c http://download.cirros-cloud.net/0.3.4/cirros-0.3.4-x86_64-uec.tar.gz
Setting --timestamping (timestamping) to 1
Setting --continue (continue) to 1
DEBUG output created by Wget 1.16.3.90-4e56a on linux-gnu.

URI encoding = ‘UTF-8’
--2015-07-28 13:00:28--  http://download.cirros-cloud.net/0.3.4/cirros-0.3.4-x86_64-uec.tar.gz
Resolving download.cirros-cloud.net (download.cirros-cloud.net)... 69.163.241.114
Caching download.cirros-cloud.net => 69.163.241.114
Connecting to download.cirros-cloud.net (download.cirros-cloud.net)|69.163.241.114|:80... connected.
Created socket 4.
Releasing 0x00000000014dc720 (new refcount 1).

---request begin---
GET /0.3.4/cirros-0.3.4-x86_64-uec.tar.gz HTTP/1.1
If-Modified-Since: Tue, 28 Jul 2015 03:00:24 GMT
Range: bytes=10485760-
User-Agent: Wget/1.16.3.90-4e56a (linux-gnu)
Accept: */*
Accept-Encoding: identity
Host: download.cirros-cloud.net
Connection: Keep-Alive

---request end---
HTTP request sent, awaiting response...
---response begin---
HTTP/1.1 304 Not Modified
Date: Tue, 28 Jul 2015 03:00:30 GMT
Server: Apache
Connection: Keep-Alive
Keep-Alive: timeout=2, max=100
ETag: "848176-51580ae5ed140"

---response end---
304 Not Modified
Registered socket 4 for persistent reuse.
File ‘cirros-0.3.4-x86_64-uec.tar.gz’ not modified on server. Omitting download.
---

Using --no-if-modified-since, we see the server does notice the range
and returns a 416 (Range Not Satisfiable).

---
$ ./wget --debug --no-if-modified-since --timestamping -c http://download.cirros-cloud.net/0.3.4/cirros-0.3.4-x86_64-uec.tar.gz
Setting --timestamping (timestamping) to 1
Setting --continue (continue) to 1
DEBUG output created by Wget 1.16.3.90-4e56a on linux-gnu.

URI encoding = ‘UTF-8’
--2015-07-28 13:00:41--  http://download.cirros-cloud.net/0.3.4/cirros-0.3.4-x86_64-uec.tar.gz
Resolving download.cirros-cloud.net (download.cirros-cloud.net)... 69.163.241.114
Caching download.cirros-cloud.net => 69.163.241.114
Connecting to download.cirros-cloud.net (download.cirros-cloud.net)|69.163.241.114|:80... connected.
Created socket 4.
Releasing 0x0000000000fbc6c0 (new refcount 1).

---request begin---
HEAD /0.3.4/cirros-0.3.4-x86_64-uec.tar.gz HTTP/1.1
Range: bytes=10485760-
User-Agent: Wget/1.16.3.90-4e56a (linux-gnu)
Accept: */*
Accept-Encoding: identity
Host: download.cirros-cloud.net
Connection: Keep-Alive

---request end---
HTTP request sent, awaiting response...
---response begin---
HTTP/1.1 416 Requested Range Not Satisfiable
Date: Tue, 28 Jul 2015 03:00:41 GMT
Server: Apache
Vary: Accept-Encoding
Keep-Alive: timeout=2, max=100
Connection: Keep-Alive
Content-Type: text/html; charset=iso-8859-1

---response end---
416 Requested Range Not Satisfiable
Registered socket 4 for persistent reuse.
URI content encoding = ‘iso-8859-1’

    The file is already fully retrieved; nothing to do.
---

I feel like wget should take this as an indication there is a
difference in file-size and trigger a re-download as documented.
I think this happens because I made the local file *larger*, a path
that probably isn't often taken.

Am I correct in thinking these are bugs, not features?

Thanks,

-i
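
P.S. To make the expectation concrete, here is a rough sketch of the
decision I read the manual as describing. It is not wget's code and
says nothing about how wget is structured internally; the function
name should_download and its parameters are made up for illustration,
and the remote size in the example call is a placeholder, not the real
size of the cirros tarball.

```python
from typing import Optional


def should_download(local_size: Optional[int],
                    remote_size: Optional[int],
                    local_mtime: float,
                    remote_mtime: float) -> bool:
    """Hypothetical helper: the time-stamping rule as the manual words it.

    local_size is None when the local file does not exist;
    remote_size is None when the server did not report a length.
    """
    # No local copy at all: always fetch.
    if local_size is None:
        return True
    # Sizes are known and differ: fetch, no matter what the time-stamps say.
    if remote_size is not None and local_size != remote_size:
        return True
    # Sizes match (or the remote size is unknown): compare time-stamps.
    return remote_mtime > local_mtime


# Local file truncated up to 10M and newer than the remote copy.
# The remote size here is an illustrative placeholder.
print(should_download(10 * 1024 * 1024, 8_000_000, 200.0, 100.0))
```

Under that reading, a local file that is larger than the remote one
should still be re-fetched even though its time-stamp is newer, which
is the case both runs above skip.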
