I'm pretty sure you can do all this with wget (--spider, --save-headers). Also, the list-urls script (/pentest/enumeration/list-urls) in BT can list all the URLs from a single web page. Scripting time? I'm sure there is a more efficient way of doing this, but something like the sketch below might get you close.
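Untested sketch, and the starting URL is just a placeholder. One catch: with --spider nothing gets written to disk, so --save-headers has no effect; -S (--server-response) dumps the headers into the log instead. I think -D does suffix matching on the hostname, so "edu" should keep it on .edu hosts while still spanning domains:

    # crawl recursively, span hosts but stay on .edu,
    # and log every URL visited plus the server response headers
    wget --spider --recursive --level=inf \
         --span-hosts --domains=edu \
         --server-response \
         -o spider.log \
         http://www.example.edu/

    # robots.txt is easiest grabbed separately
    wget -O robots.txt http://www.example.edu/robots.txt

Every URL it hits and the server banners end up in spider.log, so a bit of grep/awk afterwards should pull out the file names and Server: lines.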
On 09/24/2010 08:46 PM, Adrian Crenshaw wrote:
> Hi all,
> I'm looking at some of the tools in BT4R1, and will be looking at
> what Samurai WTF has to offer once I finish downloading the latest
> version. I'm looking for some sort of spider that lets me do the following:
>
> 1. Follow every link on a page, even onto other domains, as long as the
> top level domain name is the same (edu, com, cn, whatever)
> 2. For every page it visits, it collects the file names of all resources.
> 3. The headers, so I can see the server version.
> 4. Grab the robots.txt if possible.
>
> Any ideas on the best tool for the job, or do I need to roll my own?
>
> Thanks,
> Adrian

--
- Jon

------------------------------------------------------------------
Do you OpenGPG? Search the MIT key server with string "jon schipp" @insightbb.com.
I prefer encrypted mail, when dealing with sensitive data.
Fax & VMB: 206-426-1406
Dubois County Linux User Group - http://www.dclinux.org
BloomingLabs - http://www.bloominglabs.org
ISSA-Kentuckiana - http://issa-kentuckiana.org
