Hi,
I am using Nutch 1.0.
As a simple exercise I crawled a single domain, and afterwards I
ran both the readdb and readseg commands.
They show different figures. Which one should I consider? Did
something go wrong while crawling?
Here is the output of both commands.
OUTPUT FROM READDB:
----------------------------------------
CrawlDb statistics start: crawled/crawldb
Statistics for CrawlDb: crawled/crawldb
TOTAL urls: 84178
retry 0: 84175
retry 1: 3
min score: 0.0
avg score: 7.1693314E-5
max score: 1.2
status 1 (db_unfetched): 80475
status 2 (db_fetched): 3634
status 3 (db_gone): 8
status 4 (db_redir_temp): 29
status 5 (db_redir_perm): 32
CrawlDb statistics: done
OUTPUT FROM READSEG:
-------------------------------------------
NAME            GENERATED  FETCHER START        FETCHER END          FETCHED  PARSED
20091212212627  1          2009-12-12T21:28:29  2009-12-12T21:28:29  1        1
20091212212951  81         2009-12-12T21:32:20  2009-12-12T21:32:54  105      80
20091212213347  3691       2009-12-12T21:36:13  2009-12-12T22:16:39  3738     3621
20091212222210  84178      2009-12-12T22:24:30  2009-12-13T11:08:28  85189    81806
20091213151344  84178      2009-12-13T15:16:37  2009-12-14T05:50:45  85195    81824
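To make the mismatch concrete, here is a small sketch in plain Python that tallies the two outputs. The numbers are copied by hand from the output above, and the variable names are my own, not anything from Nutch. It only does the arithmetic; it does not tell me which figure is the right one.

```python
# Figures copied by hand from the readdb output above.
crawldb_total = 84178
crawldb_status = {
    "db_unfetched": 80475,
    "db_fetched": 3634,
    "db_gone": 8,
    "db_redir_temp": 29,
    "db_redir_perm": 32,
}

# One tuple per readseg row: (segment name, generated, fetched, parsed).
segments = [
    ("20091212212627", 1, 1, 1),
    ("20091212212951", 81, 105, 80),
    ("20091212213347", 3691, 3738, 3621),
    ("20091212222210", 84178, 85189, 81806),
    ("20091213151344", 84178, 85195, 81824),
]

# Sum the per-status counts and the per-segment columns.
status_sum = sum(crawldb_status.values())
fetched_sum = sum(row[2] for row in segments)
parsed_sum = sum(row[3] for row in segments)

print("status counts add up to TOTAL urls:", status_sum == crawldb_total)
print("db_fetched in crawldb:", crawldb_status["db_fetched"])
print("FETCHED summed over segments:", fetched_sum)
print("PARSED summed over segments:", parsed_sum)
```

So the crawldb is at least consistent with itself, but its db_fetched count is far below what the segments report as fetched.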
Thanks.
Bhavin