I won't know what those URLs are. I basically want to be able to crawl any
site and tell htdig it can follow any link it finds off the site, but
limit how far off the site it goes.
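
A minimal sketch of the kind of configuration this would need, not a tested setup. It assumes that limit_urls_to is matched as a substring pattern, so a bare "http://" would admit every HTTP link; the start URL is just the example from the original question. Note that max_hop_count counts hops from the start URL, not from the point where the crawl leaves the start site.

```
# Hypothetical htdig.conf fragment; all values are illustrative assumptions.
start_url:      http://www.somesite.com/page.html

# Assumption: patterns are substring matches, so this admits any HTTP URL.
limit_urls_to:  http://

# Do not follow links more than 3 hops away from the start URL.
max_hop_count:  3
```

If the substring assumption holds, this would avoid listing the external sites ahead of time, at the cost of letting the crawl reach any site within three hops of the start page.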

At 04:02 AM 3/6/2001, you wrote:
>Include those 10 sites in the limit_urls_to attribute in your configuration
>file, and set
>
>max_hop_count: 3
>
>--
>David Adams
>Computing Services
>Southampton University
>
>
>----- Original Message -----
>From: "Ian Lipsky" <[EMAIL PROTECTED]>
>To: <[EMAIL PROTECTED]>
>Sent: Tuesday, March 06, 2001 1:17 AM
>Subject: [htdig] crawling off a start site
>
>
> > Is it possible to configure htdig so that it will follow links off the main
> > site, but limit how many pages off the start page it will go?
> >
> > For example:
> >
> > Say I have www.somesite.com/page.html and page.html has links on it to 10
> > other sites. I would want htdig to crawl those 10 other sites, but only
> > crawl them to a depth of, say, 3 links off my start page.
> >
> >
> > _______________________________________________
> > htdig-general mailing list <[EMAIL PROTECTED]>
> > To unsubscribe, send a message to
><[EMAIL PROTECTED]> with a subject of unsubscribe
> > FAQ: http://htdig.sourceforge.net/FAQ.html
> >
>
>


