"Bell, Kevin" wrote:

> I know I can use urllib2 to get at a website given urllib2.urlopen(url),
> but I'm unsure how to then go through all the pages that are linked to
> it but still in the domain. If I want to search through the entire
> Python website given the homepage, how would I go about it?
use a search engine (try the search box in the upper right corner). using a spider to download the entire site just so you can "search through it" is bloody impolite.

if you have a valid reason to download portions of the site, use wget's mirror function, or some similar tool, and be nice. there's also a tool called "websucker" in the Tools directory of the standard Python distribution that can be used to mirror portions of a site:

    http://svn.python.org/view/python/trunk/Tools/webchecker/

</F>

--
http://mail.python.org/mailman/listinfo/python-list
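[Editor's note: for readers who do have a legitimate reason to crawl a site they control, here is a minimal sketch of the same-domain crawl the original poster described. It uses modern Python 3 stdlib names (html.parser, urllib.parse, urllib.request) rather than the Python 2 urllib2 of this thread; the function and class names are illustrative, not from any library. A real crawler should also rate-limit and honor robots.txt.]

```python
# Minimal same-domain crawler sketch. Names (LinkParser, same_domain_links,
# crawl) are hypothetical, chosen for this example. Be polite: rate-limit
# requests and respect robots.txt before running this against a live site.
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collect href values from <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def same_domain_links(base_url, html):
    """Return absolute links in html that stay on base_url's host."""
    parser = LinkParser()
    parser.feed(html)
    host = urlparse(base_url).netloc
    result = []
    for link in parser.links:
        absolute = urljoin(base_url, link)  # resolve relative hrefs
        if urlparse(absolute).netloc == host:
            result.append(absolute)
    return result


def crawl(start_url, limit=50):
    """Breadth-first crawl of one domain, capped at `limit` pages."""
    seen = set()
    queue = [start_url]
    while queue and len(seen) < limit:
        url = queue.pop(0)
        if url in seen:
            continue
        seen.add(url)
        try:
            html = urlopen(url).read().decode("utf-8", "replace")
        except Exception:
            continue  # skip pages that fail to fetch or decode
        for link in same_domain_links(url, html):
            if link not in seen:
                queue.append(link)
    return seen
```

The link-extraction step can be exercised without touching the network, e.g. `same_domain_links("http://www.python.org/", '<a href="/about">x</a>')` resolves the relative href and drops anything pointing off-domain.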