Thanks for your patch! I have looked at it and it seems fine, but in order to accept it into wget, we will have to complete the FSF copyright assignment process.
I will send you more details in a separate e-mail.

Cheers,
Giuseppe

Filipe Brandenburger <[email protected]> writes:

> Hello,
>
> Recently I had a problem with wget. This application of mine has data
> spread over several HTTP/1.1 servers that know about the others. A
> client can query any of the servers; if the server doesn't have the
> information it will know which other server has the information and it
> will return an HTTP redirect to the URL on the other server. Some of
> the queries use POST to specify parameters for which data they want,
> but those are also subject to receiving HTTP redirects, in which case
> the POST should be repeated on the next server.
>
> Usually after an HTTP redirect the client will repeat the query with a
> GET; that's the Post/Redirect/Get pattern
> (http://en.wikipedia.org/wiki/Post/Redirect/Get) used by web forms to
> send the user to another web page instead of generating the HTML
> content on the submit URL. To solve this ambiguity, HTTP/1.1
> introduced status code 307, which indicates that the server expects the
> client to try the next URL but using the same method (see
> http://en.wikipedia.org/wiki/List_of_HTTP_status_codes#3xx_Redirection;
> RFC 2616 is the one that defines this new code, but unfortunately I
> found it to be not very explicit about this behaviour). So those
> redirects I was referring to above are all implemented as 307
> Redirects.
>
> When using an HTML form in Firefox this works just fine, but I was
> trying to automate it and I noticed that wget doesn't work with that.
> I tried curl and saw that curl handles the 307 Redirects correctly, so
> for the time being I had to resort to using curl to implement my
> scripts, which is not ideal since wget is my tool of choice...
>
> So I decided to fix the issue in wget, to make it behave like both
> Firefox and curl, and to respect the "spirit" (if not the "letter") of
> the RFC.
>
> Attached to this e-mail you will find a patch created over the latest
> (at this time) wget 1.12-2443 from the bzr repository.
>
> Also attached to this e-mail there is a tarball with some files to
> help test the issue. The wget307test.tgz file should be unpacked
> directly under /var/www (or whatever the Apache root htdocs directory
> is). There's an .htaccess that will set up all that needs setting up
> as long as "AllowOverride all" is set for that directory in the main
> Apache config file. actual.cgi is a Perl CGI that will receive the
> redirected requests (redirect from /wget307test/redirect.cgi with code
> 307 also implemented inside .htaccess) and test if it worked or not.
> testform.html is a form that can be used to test the submit to a URL
> that will return a 307 Redirect from a web browser such as Firefox.
> testcurl.sh is a shell script that will do the test using curl (I
> tested it with curl 7.19.7 and it works). testwget.sh is a shell
> script that will do the test using wget (I tested it with vanilla 1.12
> or even unpatched 1.12-2443 from bzr and it does not work). The output
> of the CGI (which is what each test displays) is textual and will
> print a line indicating if the test worked or not based on a submitted
> parameter (that will be lost if the POST was translated to a GET as in
> wget's case). It will also print another submitted variable (a little
> sanity check for the CGI) and which method (GET or POST) was used for
> the request to the CGI.
>
> I also updated the documentation (wget.texi used to generate all
> others including man page) and the ChangeLog, but I may have forgotten
> something, feel free to change anything in the patch that you feel
> could be done better.
>
> I hope this helps, and I really hope to see this fix included in the
> next official release of wget! :-D
>
> Keep up the great work building this awesome web client tool!
>
> Cheers,
> Filipe
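
For anyone following the thread, the redirect rule described in the message can be sketched roughly as below. This is only an illustration in Python, not the patch itself and not wget's C code; the helper name `method_after_redirect` is hypothetical. It assumes the behaviour the message describes: a 307 keeps the original method, while a POST that hits a 301, 302 or 303 is retried as a GET (for 301 and 302 that is common client practice rather than something the RFC requires).

```python
# Illustrative sketch only; not wget's implementation.
# method_after_redirect is a hypothetical helper name.

# Redirect codes after which many clients retry a POST as a GET
# (the Post/Redirect/Get pattern mentioned in the message).
REWRITE_POST_TO_GET = {301, 302, 303}

# Redirect code that asks the client to repeat the same method.
PRESERVE_METHOD = {307}


def method_after_redirect(status: int, method: str) -> str:
    """Return the method to use for the request that follows a redirect."""
    method = method.upper()
    if status in PRESERVE_METHOD:
        # 307: repeat the request unchanged, so a POST stays a POST.
        return method
    if status in REWRITE_POST_TO_GET and method == "POST":
        # Conventional behaviour: the follow-up request becomes a GET
        # and the POST parameters are dropped.
        return "GET"
    # Anything else is left alone in this sketch.
    return method


if __name__ == "__main__":
    for status in (302, 307):
        print(status, "POST ->", method_after_redirect(status, "POST"))
```

With this rule, the test in the tarball would see a POST arrive at actual.cgi after the 307, which is the outcome the message reports for curl and Firefox.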
