Everything seems right. Both sets of stats are useful; which one to rely on depends on what you are looking for.
readdb gives you global stats for the whole crawldb, whereas readseg reports on each segment, i.e. each fetch/parse run. The crawldb holds one entry per URL, so a URL that is fetched again in a later segment is counted once per segment by readseg but only once by readdb.

2009/12/15, bhavin pandya <[email protected]>:
> Hi,
>
> I am using Nutch 1.0.
>
> For simple excercise i have crawled one single domain and after that i
> tried both command readdb and readseg...
> Both showing different figures. Which one i should consider? does
> something went wrong while crawling?
>
> Here is the output of both command.
>
> OUTPUT FROM READDB:
> ----------------------------------------
> CrawlDb statistics start: crawled/crawldb
> Statistics for CrawlDb: crawled/crawldb
> TOTAL urls: 84178
> retry 0: 84175
> retry 1: 3
> min score: 0.0
> avg score: 7.1693314E-5
> max score: 1.2
> status 1 (db_unfetched): 80475
> status 2 (db_fetched): 3634
> status 3 (db_gone): 8
> status 4 (db_redir_temp): 29
> status 5 (db_redir_perm): 32
> CrawlDb statistics: done
>
>
> OUTPUT FROM READSEG:
> -------------------------------------------
> NAME            GENERATED  FETCHER START        FETCHER END          FETCHED  PARSED
> 20091212212627  1          2009-12-12T21:28:29  2009-12-12T21:28:29  1        1
> 20091212212951  81         2009-12-12T21:32:20  2009-12-12T21:32:54  105      80
> 20091212213347  3691       2009-12-12T21:36:13  2009-12-12T22:16:39  3738     3621
> 20091212222210  84178      2009-12-12T22:24:30  2009-12-13T11:08:28  85189    81806
> 20091213151344  84178      2009-12-13T15:16:37  2009-12-14T05:50:45  85195    81824
>
>
> Thanks.
> Bhavin
>

--
-MilleBii-
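
P.S. If you want a quick sanity check that the readdb figures are internally consistent, the per-status counts should add up to TOTAL urls. Below is a minimal Python sketch of that check. It is not part of Nutch: the helper name parse_readdb_stats is made up for illustration, and the input is just the relevant lines copied from the readdb output quoted above.

```python
import re

# Relevant lines copied from the readdb output quoted in this thread.
READDB_STATS = """\
TOTAL urls: 84178
status 1 (db_unfetched): 80475
status 2 (db_fetched): 3634
status 3 (db_gone): 8
status 4 (db_redir_temp): 29
status 5 (db_redir_perm): 32
"""


def parse_readdb_stats(text):
    """Hypothetical helper: return (total, {status_name: count}).

    Assumes the line layout shown in the quoted output; other Nutch
    versions may print the statistics differently.
    """
    total = None
    statuses = {}
    for line in text.splitlines():
        line = line.strip()
        m = re.match(r"TOTAL urls:\s*(\d+)$", line)
        if m:
            total = int(m.group(1))
            continue
        m = re.match(r"status \d+ \((\w+)\):\s*(\d+)$", line)
        if m:
            statuses[m.group(1)] = int(m.group(2))
    return total, statuses


total, statuses = parse_readdb_stats(READDB_STATS)
# Every URL in the crawldb has exactly one status, so the sums should match.
print(total, sum(statuses.values()))  # prints: 84178 84178
```

The two numbers match, so the crawldb side looks consistent. The readseg counts are per segment and are not expected to add up to the same total.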
