I'm pretty sure you can do all this with wget (--spider, --save-headers). Also, the list-urls script (/pentest/enumeration/list-urls) in BT can list all the URLs from a single web page. Scripting time? I'm sure there is a more efficient way of doing this, but something like the sketch below might get you close.
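Untested sketch, and the starting URL is just a placeholder. One catch: with --spider nothing gets written to disk, so --save-headers has no effect; -S (--server-response) dumps the headers into the log instead. I think -D does suffix matching on the hostname, so "edu" should keep it on .edu hosts while still spanning domains:

    # crawl recursively, span hosts but stay on .edu,
    # and log every URL visited plus the server response headers
    wget --spider --recursive --level=inf \
         --span-hosts --domains=edu \
         --server-response \
         -o spider.log \
         http://www.example.edu/

    # robots.txt is easiest grabbed separately
    wget -O robots.txt http://www.example.edu/robots.txt

Every URL it hits and the server banners end up in spider.log, so a bit of grep/awk afterwards should pull out the file names and Server: lines.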
On 09/24/2010 08:46 PM, Adrian Crenshaw wrote:
> Hi all,
> I'm looking at some of the tools in BT4R1, and will be looking at
> what Samurai WTF has to offer once I finish downloading the latest
> version. I'm looking for some sort of spider that lets me do the following:
>
> 1. Follow every link on a page, even onto other domains, as long as the
> top level domain name is the same (edu, com, cn, whatever)
> 2. For every page it visits, it collects the file names of all resources.
> 3. The headers, so I can see the server version.
> 4. Grab the robots.txt if possible.
>
> Any ideas on the best tool for the job, or do I need to roll my own?
>
> Thanks,
> Adrian

--
- Jon

------------------------------------------------------------------
Do you OpenGPG? Search the MIT key server with string "jon schipp" @insightbb.com.
I prefer encrypted mail, when dealing with sensitive data.
Fax & VMB: 206-426-1406
Dubois County Linux User Group - http://www.dclinux.org
BloomingLabs - http://www.bloominglabs.org
ISSA-Kentuckiana - http://issa-kentuckiana.org
