I forgot to mention that no changes were made to crawl-urlfilter.txt or
regex-urlfilter.txt between the successful crawl and the crawl that ends
with the "no more URLs to fetch" message.
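
For anyone who wants to sanity-check the filters: the rules that decide
whether a seed survives live in regex-urlfilter.txt, and a stock file looks
roughly like this (quoting from memory, so treat it as approximate):

    # skip file:, ftp:, and mailto: urls
    -^(file|ftp|mailto):
    # skip URLs containing characters that are probably queries or session ids
    -[?*!@=]
    # accept anything else
    +.

If the final "+." rule is missing, or one of the "-" rules above it matches
the seed URLs, the Generator will select nothing even though injection
appears to succeed.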

rootUrlDir = urls/$folder/urls.txt
threads = 10
depth = 1
indexer=lucene
topN = 1500
Injector: starting
Injector: crawlDb: crawl/$folder/crawldb
Injector: urlDir: urls/$folder/urls.txt
Injector: Converting injected urls to crawl db entries.
Injector: Merging injected urls into crawl db.
Injector: done
Generator: Selecting best-scoring urls due for fetch.
Generator: starting
Generator: filtering: true
Generator: normalizing: true
Generator: topN: 1500
Generator: jobtracker is 'local', generating exactly one partition.
Generator: 0 records selected for fetching, exiting ...
Stopping at depth=0 - no more URLs to fetch.
No URLs to fetch - check your seed list and URL filters.
crawl finished: crawl/$folder
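
Since the Generator reports "0 records selected for fetching", the two
checks I plan to run next are below (assuming the standard Nutch 1.x tools;
exact command names may vary by version):

    # 1. Run every seed URL through the configured filter chain; each URL
    #    should come back prefixed with '+' (accepted) or '-' (rejected).
    bin/nutch org.apache.nutch.net.URLFilterChecker -allCombined \
        < urls/$folder/urls.txt

    # 2. Dump crawldb statistics. If the seeds are already marked db_fetched
    #    from the earlier successful crawl, the Generator skips them until the
    #    fetch interval (db.fetch.interval.default, 30 days by default) elapses.
    bin/nutch readdb crawl/$folder/crawldb -stats

If the second check shows everything as db_fetched, that alone would explain
why the same seed list worked once and then stopped, even with the filters
untouched.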


Still awaiting a reply...
