Hi,

I am using Nutch 1.9. After reviewing the URLs added by the Injector, the
total number of URLs is 25146.
(Log evidence)
crawl.Injector - Injector: Total number of urls after normalization: 25146

When I checked the log file, only 7003 URLs had been fetched and 6727 URLs
had been parsed.

And these are the statistics:

CrawlDb statistics start: ../crawlInfo/crawldb
Statistics for CrawlDb: ../crawlInfo/crawldb
TOTAL urls:     30914
retry 0:        30913
retry 1:        1
min score:      0.0
avg score:      0.4359605
max score:      100.002
status 1 (db_unfetched):        23912
status 2 (db_fetched):  6727
status 3 (db_gone):     8
status 4 (db_redir_temp):       266
status 5 (db_redir_perm):       1
CrawlDb statistics: done
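
To make the proportions explicit, here is a minimal sketch in Python that parses the status lines above and computes each status as a fraction of the total. The function names (parse_stats, fractions) and the regular expression are my own illustration, not part of Nutch, and the sketch assumes the statistics are printed in exactly the format shown above.

```python
import re

# Status lines copied from the CrawlDb statistics above.
STATS = """\
TOTAL urls:     30914
status 1 (db_unfetched):        23912
status 2 (db_fetched):  6727
status 3 (db_gone):     8
status 4 (db_redir_temp):       266
status 5 (db_redir_perm):       1
"""

# Matches either "TOTAL urls: N" or "status K (name): N".
LINE_RE = re.compile(r"\s*(?:status \d+ \((\w+)\)|(TOTAL urls)):\s+(\d+)\s*$")


def parse_stats(text):
    """Return a dict mapping 'TOTAL urls' and each status name to its count."""
    counts = {}
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m:
            key = m.group(1) or m.group(2)
            counts[key] = int(m.group(3))
    return counts


def fractions(counts):
    """Return each status count divided by the total URL count."""
    total = counts["TOTAL urls"]
    return {k: v / total for k, v in counts.items() if k != "TOTAL urls"}


if __name__ == "__main__":
    counts = parse_stats(STATS)
    for name, frac in fractions(counts).items():
        print(f"{name}: {counts[name]} ({frac:.1%})")
```

With these numbers, db_fetched is roughly 22% of the 30914 URLs in the CrawlDb, and the five status counts add up to the total.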

Why were only about a third of the URLs fetched and parsed?

Thanks.
