Hi, I am using wget to download some web pages that involve a redirect (HTTP 302). But it seems that wget doesn't support URL redirects, or maybe I am missing some option. For example, at my location, www.google.com redirects to www.google.de.
Now if I use the URL http://www.google.com as the argument to the wget command, I get the following output:

********************************************************************************************************************
user:/test> wget -S -t 1 -nd -E --user-agent=Mozilla/4.0 --no-check-certificate -4 -e robots=off -p http://www.google.com
asking libproxy about url 'http://www.google.com/'
libproxy suggest to use 'direct://'
--2012-01-04 16:03:57--  http://www.google.com/
Resolving www.google.com... 173.194.69.103, 173.194.69.104, 173.194.69.105, ...
Connecting to www.google.com|173.194.69.103|:80... connected.
HTTP request sent, awaiting response...
  HTTP/1.0 302 Found
  Location: http://www.google.de/
  Cache-Control: private
  Content-Type: text/html; charset=UTF-8
  Set-Cookie: PREF=ID=719837320fac89a6:FF=0:TM=1325689437:LM=1325689437:S=Ix3mGRG5_MVHxwCB; expires=Fri, 03-Jan-2014 15:03:57 GMT; path=/; domain=.google.com
  Date: Wed, 04 Jan 2012 15:03:57 GMT
  Server: gws
  Content-Length: 218
  X-XSS-Protection: 1; mode=block
  X-Frame-Options: SAMEORIGIN
  Connection: Keep-Alive
Location: http://www.google.de/ [following]
asking libproxy about url 'http://www.google.de/'
libproxy suggest to use 'direct://'
--2012-01-04 16:03:57--  http://www.google.de/
Resolving www.google.de... 173.194.69.99, 173.194.69.103, 173.194.69.104, ...
Reusing existing connection to www.google.com:80.
HTTP request sent, awaiting response...
  HTTP/1.0 200 OK
  Date: Wed, 04 Jan 2012 15:03:57 GMT
  Expires: -1
  Cache-Control: private, max-age=0
  Content-Type: text/html; charset=ISO-8859-1
  Set-Cookie: PREF=ID=c3a1a61eef5a50b6:FF=0:TM=1325689437:LM=1325689437:S=hkRHrWuLAfD0GFMv; expires=Fri, 03-Jan-2014 15:03:57 GMT; path=/; domain=.google.de
  Set-Cookie: NID=54=KzclGc-p4pi7KOE1thfOBpoHJzV8MLLuPAxmyTK9GliHKcGnjGqmVExySo-N2aI-vzuN9iSoTN4f9D5TPBbQY2LTihcbh5Hu49nUaKsfJXTNwYdiSKHVTSJoVSoJ9syB; expires=Thu, 05-Jul-2012 15:03:57 GMT; path=/; domain=.google.de; HttpOnly
  P3P: CP="This is not a P3P policy! See http://www.google.com/support/accounts/bin/answer.py?hl=en&answer=151657 for more info."
  Server: gws
  X-XSS-Protection: 1; mode=block
  X-Frame-Options: SAMEORIGIN
Length: unspecified [text/html]
Saving to: “index.html.1.5.html”

    [ <=> ] 8,892 --.-K/s in 0.02s

2012-01-04 16:03:57 (440 KB/s) - “index.html.1.5.html” saved [8892]

FINISHED --2012-01-04 16:03:57--
Downloaded: 1 files, 8.7K in 0.02s (440 KB/s)
********************************************************************************************************************

If I use http://www.google.de as the URL, then it downloads the web page successfully, with the following result:

Downloaded: 6 files, 55K in 0.06s (849 KB/s)

Please note the difference in downloaded content between the redirect and the non-redirect case. The same happens with any other URL that involves a redirect with HTTP status code 302, i.e. only one HTML file is downloaded when there is a redirect.

Could you suggest a possible solution to this problem? Is it really an error, or am I missing something?

P.S. I am running openSUSE 11.4 with GNU Wget 1.12.

Regards
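
One way to check whether the redirect itself is what limits the download would be to resolve the redirect target first and then pass that URL to wget -p directly. The following is only a minimal sketch, not a confirmed fix: last_location is a hypothetical helper name (not part of wget), and it assumes the "Location:" header format shown in the -S output above.

```shell
# Hypothetical helper: read `wget -S` output on stdin and print the URL of the
# last "Location:" header seen. Prints nothing if there was no redirect.
last_location() {
    awk '$1 == "Location:" { url = $2 } END { if (url != "") print url }'
}
```

Assumed usage: run `final=$(wget -S --spider -t 1 http://www.google.com 2>&1 | last_location)` and then repeat the original command with "$final" in place of http://www.google.com. If the second run reports 6 files again, that would point at the redirect handling rather than at the page.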
