Hi,

Thanks for your report. It is indeed a bug in Wget, as you've rightfully
investigated. The socket still had some data which caused the next request to
have problems.

I've attached two patches here, the first one fixes the issue. It tries to read
and discard any HTTP body still available and then re-use the socket. The
second patch adds a test case for this scenario

* Iru Cai <mytbk920...@gmail.com> [171208 17:19]:
> Hello wget developers,
> 
> I found an issue when using `wget -c`, as in:
> 
>   https://github.com/mholt/caddy/issues/1965#issuecomment-349220927
> 
> By checking out the wget source code, I can confirm that it doesn't
> drain the response body when it meets a 416 Requested Range Not
> Satisfiable, and then the socket will be reused for the second request
> (http get 2.dat in this case). When parse its response, it will
> encounter the first response's body, so it failed to get the correct
> response header. This is why you get a blank response header.
> 
> Hope this can be fixed.
> 
> Thanks,
> Iru



-- 
Thanking You,
Darshit Shah
PGP Fingerprint: 7845 120B 07CB D8D6 ECE5 FF2B 2A17 43ED A91A 35B6
From a0ffc151036c3d63f153ab3a3d8a30994c47fedf Mon Sep 17 00:00:00 2001
From: Darshit Shah <dar...@gnu.org>
Date: Fri, 8 Dec 2017 18:13:00 +0100
Subject: [PATCH 1/2] Don't assume a 416 response has no body

* http.c(gethttp): In case of a 416 response, try to drain the socket of
any bytes before reusing the connection

Reported-By: Iru Cai <mytbk920...@gmail.com>
---
 src/http.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/src/http.c b/src/http.c
index 95d26258..e4ff0107 100644
--- a/src/http.c
+++ b/src/http.c
@@ -3969,11 +3969,16 @@ gethttp (const struct url *u, struct url *original_url, 
struct http_stat *hs,
       hs->res = 0;
       /* Mark as successfully retrieved. */
       *dt |= RETROKF;
-      if (statcode == HTTP_STATUS_RANGE_NOT_SATISFIABLE)
+
+      /* Try to maintain the keep-alive connection. It is often cheaper to
+       * consume some bytes which have already been sent than to negotiate
+       * a new connection. However, if the body is too large, or we don't
+       * care about keep-alive, then simply terminate the connection */
+      if (keep_alive &&
+          skip_short_body (sock, contlen, chunked_transfer_encoding))
         CLOSE_FINISH (sock);
       else
-        CLOSE_INVALIDATE (sock);        /* would be CLOSE_FINISH, but there
-                                   might be more bytes in the body. */
+        CLOSE_INVALIDATE (sock);
       retval = RETRUNNEEDED;
       goto cleanup;
     }
-- 
2.15.1

From c17b04767a1b58ee8f88889db53af431ef1e63b5 Mon Sep 17 00:00:00 2001
From: Darshit Shah <dar...@gnu.org>
Date: Fri, 8 Dec 2017 18:41:07 +0100
Subject: [PATCH 2/2] Add new test for 416 responses

* testenv/server/http/http_server.py: If there are multiple requests in
which the requested range is unsatisfiable, then send a body in the in
the 2nd response onwards
* testenv/Test-416.py: New test to check how Wget handles 416 responses
---
 testenv/Test-416.py                | 53 ++++++++++++++++++++++++++++++++++++++
 testenv/server/http/http_server.py |  8 ++++++
 2 files changed, 61 insertions(+)
 create mode 100755 testenv/Test-416.py

diff --git a/testenv/Test-416.py b/testenv/Test-416.py
new file mode 100755
index 00000000..76b94213
--- /dev/null
+++ b/testenv/Test-416.py
@@ -0,0 +1,53 @@
+#!/usr/bin/env python3
+from sys import exit
+from test.http_test import HTTPTest
+from misc.wget_file import WgetFile
+
+"""
+    Ensure that Wget behaves well when the server responds with a HTTP 416
+    status code. This test checks both cases:
+        1. Server sends no body
+        2. Server sends a body
+"""
+############# File Definitions ###############################################
+File1 = "abababababababababababababababababababababababababababababababababab"
+File2 = "ababababababababababababababababababab"
+
+A_File = WgetFile ("File1", File1)
+B_File = WgetFile ("File1", File1)
+
+C_File = WgetFile ("File2", File2)
+D_File = WgetFile ("File2", File1)
+
+E_File = WgetFile ("File3", File1)
+
+WGET_OPTIONS = "-c"
+WGET_URLS = [["File1", "File2", "File3"]]
+
+Files = [[A_File, C_File, E_File]]
+Existing_Files = [B_File, D_File]
+
+ExpectedReturnCode = 0
+ExpectedDownloadedFiles = [B_File, D_File, E_File]
+
+################ Pre and Post Test Hooks #####################################
+pre_test = {
+    "ServerFiles"       : Files,
+    "LocalFiles"        : Existing_Files
+}
+test_options = {
+    "WgetCommands"      : WGET_OPTIONS,
+    "Urls"              : WGET_URLS
+}
+post_test = {
+    "ExpectedFiles"     : ExpectedDownloadedFiles,
+    "ExpectedRetcode"   : ExpectedReturnCode
+}
+
+err = HTTPTest (
+                pre_hook=pre_test,
+                test_params=test_options,
+                post_hook=post_test
+).begin ()
+
+exit (err)
diff --git a/testenv/server/http/http_server.py 
b/testenv/server/http/http_server.py
index ffc80ed3..434666dd 100644
--- a/testenv/server/http/http_server.py
+++ b/testenv/server/http/http_server.py
@@ -425,8 +425,16 @@ class _Handler(BaseHTTPRequestHandler):
             except ServerError as ae:
                 # self.log_error("%s", ae.err_message)
                 if ae.err_message == "Range Overflow":
+                    try:
+                        self.overflows += 1
+                    except AttributeError as s:
+                        self.overflows = 0
                     self.send_response(416)
+                    if self.overflows > 0:
+                        self.add_header("Content-Length", 17)
                     self.finish_headers()
+                    if self.overflows > 0:
+                        return("Range Unsatisfied", 0)
                     return(None, None)
                 else:
                     self.range_begin = None
-- 
2.15.1

Attachment: signature.asc
Description: PGP signature

Reply via email to