wang qiang wrote:
> Hello there,

Hi. In the future, please use the list ([email protected]) for support
requests. I can't promise to answer personal-mail support requests.

> When I tested the WGet, I met a question.
>
> I used the command ./src/wget -r -l6 http://news.yahoo.com
> to get the pages, it worked well.
>
> But I use the command
>
> ./src/wget -r -l6 http://csce.uark.edu
>
> it just could get the first page i.e. index.html, and then halted.
>
> Could you please tell me how to solve this problem? I found that there
> was a "robot.txt" in the folder when retrieving from news.yahoo.com,
> but no "robot.txt" when retrieving from csce.uark.edu. Thanks,

csce.uark.edu includes many links to hosts other than "csce.uark.edu":
www.csce.uark.edu, for example, and some others for hosting images, I
think. Wget by default will refuse to follow links to other hosts; you
need to add

  -H -D csce.uark.edu

to get the other links (changing the requested URI to www.csce.uark.edu
doesn't help much, because there are many links to csce.uark.edu
(without www) as well).

-- 
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer.
GNU Maintainer: wget, screen, teseq
http://micah.cowan.name/
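With the options above added, the second command would read
./src/wget -r -l6 -H -D csce.uark.edu http://csce.uark.edu. To illustrate
why the original run stopped after index.html, here is a rough Python
sketch of the host check applied to each link during recursion. It is a
simplified model written for this explanation, not Wget's actual code: the
function name would_follow, its parameters, the dot-boundary suffix match,
and the example link paths are all assumptions.

```python
from urllib.parse import urlsplit


def would_follow(link, start_url, span_hosts=False, domains=()):
    """Simplified, hypothetical model of Wget's host filtering.

    span_hosts stands in for -H and domains for the -D list.
    Wget's real matching rules may differ in detail.
    """
    link_host = (urlsplit(link).hostname or "").lower()
    start_host = (urlsplit(start_url).hostname or "").lower()

    # -D: if a domain list is given, the link's host must fall inside it.
    if domains:
        wanted = [d.lower() for d in domains]
        in_domain = any(
            link_host == d or link_host.endswith("." + d) for d in wanted
        )
        if not in_domain:
            return False

    # Without -H, links to any host other than the starting one are skipped.
    if not span_hosts and link_host != start_host:
        return False

    return True


if __name__ == "__main__":
    start = "http://csce.uark.edu/"
    # The link paths below are made up for illustration.
    for link in ("http://csce.uark.edu/b.html",
                 "http://www.csce.uark.edu/a.html",
                 "http://example.com/"):
        default = would_follow(link, start)
        with_opts = would_follow(link, start, span_hosts=True,
                                 domains=["csce.uark.edu"])
        print(link, "default:", default, "with -H -D:", with_opts)
```

In this model, a link to www.csce.uark.edu is skipped by default because
its host differs from csce.uark.edu, and is followed once host spanning is
enabled and restricted to the csce.uark.edu domain; unrelated hosts stay
excluded either way.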
