Hello everyone,

I am trying to "sync" single files from web servers. That means I want to
re-download the file when
- the file on the server is newer than the file on the local machine
- the file size on the server differs from the local file size.
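
To make the two conditions concrete, this is the decision I am after, as a minimal Python sketch. The function name and its parameters are made up for illustration and are not part of wget:

```python
def needs_download(local_mtime, local_size, remote_mtime, remote_size):
    """Return True if the local copy should be re-downloaded.

    Timestamps are seconds since the epoch, sizes are in bytes.
    Pass local_mtime=None and local_size=None when no local copy exists yet.
    """
    if local_mtime is None or local_size is None:
        return True  # nothing downloaded yet
    if remote_mtime > local_mtime:
        return True  # the server copy is newer
    if remote_size != local_size:
        return True  # the sizes differ
    return False
```

A local copy with the same size and the same (or a newer) timestamp is left alone; anything else triggers a new download.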

To achieve this I use this invocation (with wget 1.21.3):
wget -N --no-if-modified-since --continue -P <local directory> <HTTP-URL>

The first download is OK, but once the file has already been downloaded I get:


---

HTTP request sent, awaiting response... 416 Requested Range Not Satisfiable

    The file is already fully retrieved; nothing to do.

---

and the process hangs there for 60+ seconds.
Using Wireshark I see that the server sends the 416 response and wget replies
with an [ACK] packet. Then both sides fall silent for ~65 s, until the server
sends a [FIN, ACK] packet.

Diving into the source, I got to src/http.c, the section from line 4030
onwards. Based on printf debugging, I believe the part that hangs is the call
to skip_short_body().
I therefore tested adding --no-http-keep-alive to the command, and with that
option the command finishes in an instant.

At this point I need your help to understand the intent of the call to
skip_short_body().
In my use case the file is already fully downloaded, so the requested range
contains zero bytes and the server responds with the 416. Why does wget expect
a short body to arrive in this situation? I can imagine this code is meant for
a different use case, but maybe you can shed some light on it?
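
For reference, this is how I understand the 416: with --continue, wget asks for the remainder of the file by sending "Range: bytes=<local size>-", and a range that starts at or beyond the end of the file cannot be satisfied. A minimal Python sketch of that logic, with helper names of my own (this is not wget's or any server's actual code):

```python
def range_header(local_size):
    """Range header value a resuming client sends when local_size bytes are on disk."""
    return f"bytes={local_size}-"


def expected_status(local_size, remote_size):
    """Status a range-capable server is expected to return for that request."""
    if local_size < remote_size:
        return 206  # Partial Content: the remaining bytes follow
    return 416  # Range Not Satisfiable: the range starts at or past the end
```

With local_size equal to remote_size, which is my situation on the second run, the sketch yields 416, matching the output above.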

If the code is fine, that is OK for me; I have learned that
--no-http-keep-alive speeds up my use case. But maybe this is actually a bug?


Best regards,
Michi
