In the config file, are you setting the limit_urls_to attribute to match
the start_url attribute? Something like:
start_url: http://www.somesite1.com/stuff/ \
http://www.somesite2.com/otherstuff/
limit_urls_to: http://www.somesite1.com/stuff/ \
http://www.somesite2.com/otherstuff/
This should cause htdig to index only pages whose full URL contains
either http://www.somesite1.com/stuff/ or
http://www.somesite2.com/otherstuff/.
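
To illustrate the matching I mean, here is a rough Python sketch. It is not
htdig code; is_within_limits and LIMIT_URLS_TO are names I made up, and it
assumes plain substring matching as described above.

```python
# Hypothetical illustration only; this is not part of htdig.
# Assumes each limit_urls_to entry is matched as a plain substring of the URL.
LIMIT_URLS_TO = [
    "http://www.somesite1.com/stuff/",
    "http://www.somesite2.com/otherstuff/",
]


def is_within_limits(url, patterns=LIMIT_URLS_TO):
    """Return True if at least one pattern occurs somewhere in the URL."""
    return any(pattern in url for pattern in patterns)


if __name__ == "__main__":
    for candidate in (
        "http://www.somesite1.com/stuff/page.html",
        "http://www.elsewhere.example/links.html",
    ):
        print(candidate, is_within_limits(candidate))
```

Under that assumption, a link to a site outside the list would fail the check
and be skipped.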
Jim
Glenn J. Rowe's bits of Sun, 5 Mar 2000 translated to:
>Pardon me. I just started using htdig and just now joined this mailing
>list. I have a question which I am sure someone will be able to answer.
>
>I have specified a rather small list of sites that should be indexed.
>htdig does only index those sites; however, when indexing it follows
>links to sites that aren't in the list. This poses a problem because a
>few sites have a large amount of external links on them and htdig
>follows every one of those links. It doesn't index them but it follows
>them thus making the indexing process take FOREVER. Is there a way to
>stop that?
>
>Glenn Rowe
>OttawaComputer.Com
>
>
>------------------------------------
>To unsubscribe from the htdig mailing list, send a message to
>[EMAIL PROTECTED]
>You will receive a message to confirm this.