wget --spider issue
Hi,

I have found a problem with wget --spider. It works great for any files over HTTP or FTP, but as soon as one of these two conditions occurs, wget starts downloading the file:

1. linked files (I'm not 100% sure about this)
2. download scripts (e.g. http://www.nothing.com/download.php?file=12345)

I have included one link (a motherboard BIOS file) that starts downloading even with the --spider option:

http://club.aopen.com.tw/downloads/Download.asp?RecNo=3587&Section=5&Product=Motherboards&Model=AX59%20Pro&Type=Manual&DownSize=8388

So this actually starts downloading:

$ wget --spider 'http://club.aopen.com.tw/downloads/Download.asp?RecNo=3587&Section=5&Product=Motherboards&Model=AX59%20Pro&Type=Manual&DownSize=8388'

If there is no solution to this problem using wget, can anyone recommend another link verifier? What I want to do is check the existence of some 200k links stored in a database. So far I have been trying to use

/usr/bin/wget --spider \' . $link . \' 2>&1 | tail -2 | head -1

in a simple PHP script.

Thanks for any help!

--
Best Regards,
Andreas Belitz
CIO
TCTK - Database Solutions
Nordanlage 3
35390 Giessen
Germany
Phone  : +49 (0) 641 3019 446
Fax    : +49 (0) 641 3019 535
Mobile : +49 (0) 176 700 16161
E-mail : mailto:[EMAIL PROTECTED]
Internet : http://www.tctk.de
Re: wget --spider issue
On Wed, 10 Sep 2003, Andreas Belitz wrote:

> Hi, I have found a problem with wget --spider. It works great for
> any files over HTTP or FTP, but as soon as one of these two
> conditions occurs, wget starts downloading the file:
>
> 1. linked files (I'm not 100% sure about this)
> 2. download scripts (e.g. http://www.nothing.com/download.php?file=12345)
>
> I have included one link (a motherboard BIOS file) that starts
> downloading even with the --spider option:
>
> $ wget --spider 'http://club.aopen.com.tw/downloads/Download.asp?RecNo=3587&Section=5&Product=Motherboards&Model=AX59%20Pro&Type=Manual&DownSize=8388'

Actually, what you call download scripts are HTTP redirects, and in this case the redirect is to an FTP server, and if you double-check I think you'll find Wget does not know how to spider in FTP. End run-on sentence.

> If there is no solution to this problem using wget, can anyone
> recommend another link verifier? What I want to do is check the
> existence of some 200k links stored in a database. So far I have
> been trying to use
>
> /usr/bin/wget --spider \' . $link . \' 2>&1 | tail -2 | head -1
>
> in a simple PHP script.

I do something similar with Wget (using shell scripting instead), and I am pleased with the outcome. Since you are calling Wget once per link, and Wget does a good job of reporting success or failure through its exit status, you can simply do this:

wget --spider "$link" || echo "$link" >> badlinks.txt

I can send you my shell scripts if you're interested.

/a

> Thanks for any help!

--
Our armies do not come into your cities and lands as conquerors or enemies, but as liberators.
- British Lt. Gen. Stanley Maude. Proclamation to the People of the Wilayat of Baghdad. March 8, 1917.
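For a list of links rather than a single $link, the one-liner above could be wrapped in a loop. The following is only a minimal sketch, not the attached scripts: the function name check_links and the file names links.txt and badlinks.txt are made up for illustration, and it assumes one URL per line. The checker command is passed in as an argument so that something other than wget can be substituted.

```shell
#!/bin/sh
# check_links CHECKER INFILE OUTFILE   (hypothetical helper, for illustration)
# Runs CHECKER on every non-empty line of INFILE and appends each link
# for which CHECKER exits non-zero to OUTFILE.
check_links() {
    checker=$1
    infile=$2
    outfile=$3
    while IFS= read -r link; do
        # Skip blank lines.
        [ -n "$link" ] || continue
        # $checker is left unquoted on purpose so that a string such as
        # "wget --spider -q" splits into a command and its options.
        # </dev/null keeps the checker from reading the rest of the list.
        $checker "$link" </dev/null >/dev/null 2>&1 ||
            printf '%s\n' "$link" >>"$outfile"
    done <"$infile"
}

# Assumed usage, with links.txt holding one URL per line:
#   check_links "wget --spider -q" links.txt badlinks.txt
```

Relying on the exit status instead of parsing the output with tail and head keeps the check independent of how a particular Wget version words its messages.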
Re: wget --spider issue
Hi Aaron S. Hawley,

On Wed, 10 September 2003 you wrote:

ASH> actually, what you call download scripts are actually HTTP redirects, and
ASH> in this case the redirect is to an FTP server and if you double-check i
ASH> think you'll find Wget does not know how to spider in ftp. end
ASH> run-on-sentence.

OK. This seems to be the reason. Thanks. Is there any way to make wget spider FTP addresses?

ASH> I can send you my shell scripts if you're interested.
ASH> /a

That would be great!

--
Best regards,
Andreas Belitz
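One way to confirm that a given link is a redirect to an FTP server, as described above, is to look at the Location header of the first response. The sketch below is only an illustration: redirect_scheme is a made-up helper that reads response headers on standard input, and the usage line assumes a Wget build that has the --max-redirect option, which is newer than this thread.

```shell
#!/bin/sh
# redirect_scheme   (hypothetical helper, for illustration)
# Reads HTTP response headers on stdin and prints the URL scheme of the
# last Location header, e.g. "ftp" or "http". Prints nothing if the
# input contains no Location header with an absolute URL.
redirect_scheme() {
    sed -n 's/^[[:space:]]*[Ll]ocation:[[:space:]]*\([A-Za-z][A-Za-z0-9+.-]*\):\/\/.*/\1/p' |
        tail -n 1
}

# Assumed usage: print the server response (-S), do not follow the
# redirect, and classify where it points:
#   wget --spider -S --max-redirect=0 "$link" 2>&1 | redirect_scheme
```

Links for which this prints ftp could then be set aside for a separate FTP check instead of being passed to wget --spider.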
Re: wget --spider issue
On Wed, 10 Sep 2003, Andreas Belitz wrote:

> Hi Aaron S. Hawley,
>
> On Wed, 10 September 2003 you wrote:
>
> ASH> actually, what you call download scripts are actually HTTP redirects, and
> ASH> in this case the redirect is to an FTP server and if you double-check i
> ASH> think you'll find Wget does not know how to spider in ftp. end
> ASH> run-on-sentence.
>
> OK. This seems to be the reason. Thanks. Is there any way to make
> wget spider FTP addresses?

I sent a patch to this list over the winter. It's included with the shell scripts I spoke of, which I have attached to this message.

> ASH> I can send you my shell scripts if you're interested.
> ASH> /a
>
> That would be great!

gnurls-0.1.tar.gz
Description: Binary data