I've merged the above patches to master. They will be available with the next version of Wget
* Darshit Shah <[email protected]> [171208 18:47]: > Hi, > > Thanks for your report. It is indeed a bug in Wget, as you've rightfully > investigated. The socket still had some data which caused the next request to > have problems. > > I've attached two patches here, the first one fixes the issue. It tries to > read > and discard any HTTP body still available and then re-use the socket. The > second patch adds a test case for this scenario > > * Iru Cai <[email protected]> [171208 17:19]: > > Hello wget developers, > > > > I found an issue when using `wget -c`, as in: > > > > https://github.com/mholt/caddy/issues/1965#issuecomment-349220927 > > > > By checking out the wget source code, I can confirm that it doesn't > > drain the response body when it meets a 416 Requested Range Not > > Satisfiable, and then the socket will be reused for the second request > > (http get 2.dat in this case). When parse its response, it will > > encounter the first response's body, so it failed to get the correct > > response header. This is why you get a blank response header. > > > > Hope this can be fixed. > > > > Thanks, > > Iru > > > > -- > Thanking You, > Darshit Shah > PGP Fingerprint: 7845 120B 07CB D8D6 ECE5 FF2B 2A17 43ED A91A 35B6 > From a0ffc151036c3d63f153ab3a3d8a30994c47fedf Mon Sep 17 00:00:00 2001 > From: Darshit Shah <[email protected]> > Date: Fri, 8 Dec 2017 18:13:00 +0100 > Subject: [PATCH 1/2] Don't assume a 416 response has no body > > * http.c(gethttp): In case of a 416 response, try to drain the socket of > any bytes before reusing the connection > > Reported-By: Iru Cai <[email protected]> > --- > src/http.c | 11 ++++++++--- > 1 file changed, 8 insertions(+), 3 deletions(-) > > diff --git a/src/http.c b/src/http.c > index 95d26258..e4ff0107 100644 > --- a/src/http.c > +++ b/src/http.c > @@ -3969,11 +3969,16 @@ gethttp (const struct url *u, struct url > *original_url, struct http_stat *hs, > hs->res = 0; > /* Mark as successfully retrieved. */ > *dt |= RETROKF; > - if (statcode == HTTP_STATUS_RANGE_NOT_SATISFIABLE) > + > + /* Try to maintain the keep-alive connection. It is often cheaper to > + * consume some bytes which have already been sent than to negotiate > + * a new connection. However, if the body is too large, or we don't > + * care about keep-alive, then simply terminate the connection */ > + if (keep_alive && > + skip_short_body (sock, contlen, chunked_transfer_encoding)) > CLOSE_FINISH (sock); > else > - CLOSE_INVALIDATE (sock); /* would be CLOSE_FINISH, but there > - might be more bytes in the body. */ > + CLOSE_INVALIDATE (sock); > retval = RETRUNNEEDED; > goto cleanup; > } > -- > 2.15.1 > > From c17b04767a1b58ee8f88889db53af431ef1e63b5 Mon Sep 17 00:00:00 2001 > From: Darshit Shah <[email protected]> > Date: Fri, 8 Dec 2017 18:41:07 +0100 > Subject: [PATCH 2/2] Add new test for 416 responses > > * testenv/server/http/http_server.py: If there are multiple requests in > which the requested range is unsatisfiable, then send a body in the in > the 2nd response onwards > * testenv/Test-416.py: New test to check how Wget handles 416 responses > --- > testenv/Test-416.py | 53 > ++++++++++++++++++++++++++++++++++++++ > testenv/server/http/http_server.py | 8 ++++++ > 2 files changed, 61 insertions(+) > create mode 100755 testenv/Test-416.py > > diff --git a/testenv/Test-416.py b/testenv/Test-416.py > new file mode 100755 > index 00000000..76b94213 > --- /dev/null > +++ b/testenv/Test-416.py > @@ -0,0 +1,53 @@ > +#!/usr/bin/env python3 > +from sys import exit > +from test.http_test import HTTPTest > +from misc.wget_file import WgetFile > + > +""" > + Ensure that Wget behaves well when the server responds with a HTTP 416 > + status code. This test checks both cases: > + 1. Server sends no body > + 2. Server sends a body > +""" > +############# File Definitions > ############################################### > +File1 = > "abababababababababababababababababababababababababababababababababab" > +File2 = "ababababababababababababababababababab" > + > +A_File = WgetFile ("File1", File1) > +B_File = WgetFile ("File1", File1) > + > +C_File = WgetFile ("File2", File2) > +D_File = WgetFile ("File2", File1) > + > +E_File = WgetFile ("File3", File1) > + > +WGET_OPTIONS = "-c" > +WGET_URLS = [["File1", "File2", "File3"]] > + > +Files = [[A_File, C_File, E_File]] > +Existing_Files = [B_File, D_File] > + > +ExpectedReturnCode = 0 > +ExpectedDownloadedFiles = [B_File, D_File, E_File] > + > +################ Pre and Post Test Hooks > ##################################### > +pre_test = { > + "ServerFiles" : Files, > + "LocalFiles" : Existing_Files > +} > +test_options = { > + "WgetCommands" : WGET_OPTIONS, > + "Urls" : WGET_URLS > +} > +post_test = { > + "ExpectedFiles" : ExpectedDownloadedFiles, > + "ExpectedRetcode" : ExpectedReturnCode > +} > + > +err = HTTPTest ( > + pre_hook=pre_test, > + test_params=test_options, > + post_hook=post_test > +).begin () > + > +exit (err) > diff --git a/testenv/server/http/http_server.py > b/testenv/server/http/http_server.py > index ffc80ed3..434666dd 100644 > --- a/testenv/server/http/http_server.py > +++ b/testenv/server/http/http_server.py > @@ -425,8 +425,16 @@ class _Handler(BaseHTTPRequestHandler): > except ServerError as ae: > # self.log_error("%s", ae.err_message) > if ae.err_message == "Range Overflow": > + try: > + self.overflows += 1 > + except AttributeError as s: > + self.overflows = 0 > self.send_response(416) > + if self.overflows > 0: > + self.add_header("Content-Length", 17) > self.finish_headers() > + if self.overflows > 0: > + return("Range Unsatisfied", 0) > return(None, None) > else: > self.range_begin = None > -- > 2.15.1 > -- Thanking You, Darshit Shah PGP Fingerprint: 7845 120B 07CB D8D6 ECE5 FF2B 2A17 43ED A91A 35B6
signature.asc
Description: PGP signature
