Howdy
On 25 September 2010 02:46, Adrian Crenshaw <[email protected]> wrote:
> I'm looking at some of the tools in BT4R1, and will be looking at what
> Samurai WTF has to offer once I finish downloading the latest version. I'm
> looking for some sort of spider that lets me do the following:
>
> 1. Follow every link on a page, even onto other domains, as long as the top
> level domain name is the same (edu, com, cn, whatever)
> 2. For every page it visits, it collects the file names of all resources.
> 3. The headers, so I can see the server version.
> 4. Grab the robots.txt if possible.
I'd probably stick with wget and a simple bit of bash scripting.
wget --spider -r -o log.txt http://myballsaresore.com
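Something like the sketch below might cover the other points too — the target domain and log file names are placeholders, and note that wget's --domains restricts to whole domains, not bare TLDs like "edu", so point 1 would still need a bit of scripting on top:

```shell
#!/bin/sh
# Rough wget-based spider sketch (placeholder target, adjust to taste).
TARGET="http://example.com"   # hypothetical target domain

# Build the crawl command:
#   --spider           don't save pages, just walk the links
#   -r -l 5            recurse, capped at depth 5
#   --span-hosts       allow leaving the start host...
#   --domains=...      ...but only into the listed domain(s)
#   --server-response  log the HTTP headers (server version, etc.)
CMD="wget --spider -r -l 5 --span-hosts --domains=example.com \
  --server-response -o spider.log $TARGET"
echo "$CMD"

# Grab robots.txt separately:
#   wget -q -O robots.txt $TARGET/robots.txt
# Then pull the visited resource names out of the log:
#   grep -Eo 'https?://[^ ]+' spider.log | sort -u
```
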
--
Matt Erasmus <[email protected]>
@z0nbi
_______________________________________________
Pauldotcom mailing list
[email protected]
http://mail.pauldotcom.com/cgi-bin/mailman/listinfo/pauldotcom
Main Web Site: http://pauldotcom.com