You don't need to check manually if you use the generator's return code. It
returns a non-zero value when no fetch-list is generated, which usually means
there is nothing left to crawl at the moment.
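A minimal sketch of such a loop, assuming a standard Nutch layout (the
crawldb/segment paths and the iteration cap are placeholders you would
adapt to your own setup):

```shell
#!/bin/sh
# Hypothetical paths -- adjust to your installation.
CRAWLDB=crawl/crawldb
SEGMENTS=crawl/segments
MAX_ROUNDS=50   # safety cap so a misconfiguration cannot loop forever

round=0
while [ "$round" -lt "$MAX_ROUNDS" ]; do
  round=$((round + 1))

  # generate exits non-zero when no fetch-list could be created,
  # i.e. there are no more unfetched pages in the crawldb.
  if ! bin/nutch generate "$CRAWLDB" "$SEGMENTS"; then
    echo "No more URLs to fetch; stopping after $round round(s)."
    break
  fi

  # Operate on the segment that generate just created (the newest one).
  SEGMENT=$(ls -d "$SEGMENTS"/* | tail -1)

  bin/nutch fetch "$SEGMENT"
  bin/nutch parse "$SEGMENT"
  bin/nutch updatedb "$CRAWLDB" "$SEGMENT"
done
```

The loop simply replaces the manual "check the crawldb, then decide" step
with the generator's own exit status.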

> Hi all,
> 
> does anyone have a suggestion for how I could solve the following task:
> 
> I want to crawl a sub-domain of our network completely. I have always done
> it with multiple fetch / parse / update cycles run manually. After a few
> cycles I checked whether there were still unfetched pages in the crawldb.
> If so, I started the cycle again, and repeated that until no new pages
> were discovered. But that annoys me, so I am looking for a way to run
> these steps automatically until no unfetched pages are left.
> 
> Any ideas?
