On Wed, 10 Sep 2003, Andreas Belitz wrote:

> Hi,
>
> i have found a problem regarding wget --spider.
>
> It works great for any files over http or ftp, but as soon as one of
> these two conditions occur, wget starts downloading the file:
>
> 1. linked files (i'm not 100% sure about this)
> 2. download scripts (i.e. http://www.nothing.com/download.php?file=12345&)
>
> i have included one link that starts downloading even if using the
> --spider option:
>
> http://club.aopen.com.tw/downloads/Download.asp?RecNo=3587&Section=5&Product=Motherboards&Model=AX59%20Pro&Type=Manual&DownSize=8388
> (MoBo Bios file);
>
> so this actually starts downloading:
>
> $ wget --spider 'http://club.aopen.com.tw/downloads/Download.asp?RecNo=3587&Section=5&Product=Motherboards&Model=AX59%20Pro&Type=Manual&DownSize=8388'
Actually, what you call download scripts are HTTP redirects. In this
case the redirect points to an FTP server, and if you double-check I
think you'll find that Wget does not know how to spider over FTP.

> If there is no conlclusion to this problem using wget can anyone
> recommend another "Link-Verifier"? What i want to do is: check the
> existence of som 200k links store in a database. So far i was trying
> to use "/usr/bin/wget --spider \'" . $link . "\' 2>&1 | tail -2 | head
> -1" in a simple php script.

I do something similar with Wget (using shell scripting instead), and
I am pleased with the outcome. Since you are already calling Wget once
per link, and since Wget reliably reports success or failure through
its exit status, you can simply do this:

  wget --spider '$link' || echo '$link' >> badlinks.txt

I can send you my shell scripts if you're interested; a rough sketch
of the loop is below, after my signature.

/a

> Thanks for any help!

--
"Our armies do not come into your cities and lands as conquerors or
enemies, but as liberators."
 - British Lt. Gen. Stanley Maude. "Proclamation to the People of the
   Wilayat of Baghdad". March 8, 1917.
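P.S. Here is a minimal sketch of the kind of loop I mean, in plain
/bin/sh. The file names (links.txt, badlinks.txt) are only placeholders;
point them at wherever you dump the 200k links from your database:

  #!/bin/sh
  # Rough sketch of the link-checking loop described above.
  # Assumed placeholders: links.txt holds one URL per line,
  # badlinks.txt collects the URLs that fail the check.

  : > badlinks.txt                   # start with an empty report

  while IFS= read -r link; do
      # --spider makes Wget check the link without saving the file;
      # its exit status is zero on success, non-zero on failure.
      if ! wget -q --spider "$link"; then
          echo "$link" >> badlinks.txt
      fi
  done < links.txt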