Hi,
Thanks for your prompt reply.
But as per readdb, the crawldb has 3634 fetched pages:
>> status 1 (db_unfetched): 80475
>> status 2 (db_fetched): 3634
Whereas as per readseg, if I add up the fetched/parsed pages for all segments, it
comes to much more (1 + 81 + 3691 + 84178 + 84178):
NAME            GENERATED  FETCHER START        FETCHER END          FETCHED  PARSED
20091212212627  1          2009-12-12T21:28:29  2009-12-12T21:28:29  1        1
20091212212951  81         2009-12-12T21:32:20  2009-12-12T21:32:54  105      80
20091212213347  3691       2009-12-12T21:36:13  2009-12-12T22:16:39  3738     3621
20091212222210  84178      2009-12-12T22:24:30  2009-12-13T11:08:28  85189    81806
20091213151344  84178      2009-12-13T15:16:37  2009-12-14T05:50:45  85195    81824
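To make the addition explicit, here is a minimal Python sketch that sums each readseg column, using the rows copied from the listing above. Note that the figures in the parentheses (1 + 81 + 3691 + 84178 + 84178) match the GENERATED column, so the sketch totals GENERATED, FETCHED and PARSED separately. The names SEGMENTS and column_totals are made up for this illustration and are not part of Nutch; the sketch only does the arithmetic and does not by itself explain why the totals differ from readdb.

```python
# Rows copied from the readseg listing: (name, generated, fetched, parsed).
SEGMENTS = [
    ("20091212212627", 1, 1, 1),
    ("20091212212951", 81, 105, 80),
    ("20091212213347", 3691, 3738, 3621),
    ("20091212222210", 84178, 85189, 81806),
    ("20091213151344", 84178, 85195, 81824),
]

# Figures copied from the readdb output for the same crawl.
CRAWLDB_TOTAL_URLS = 84178
CRAWLDB_FETCHED = 3634


def column_totals(segments):
    """Sum the GENERATED, FETCHED and PARSED columns over all segments."""
    generated = sum(row[1] for row in segments)
    fetched = sum(row[2] for row in segments)
    parsed = sum(row[3] for row in segments)
    return generated, fetched, parsed


if __name__ == "__main__":
    generated, fetched, parsed = column_totals(SEGMENTS)
    print("sum of GENERATED over segments:", generated)
    print("sum of FETCHED over segments:  ", fetched)
    print("sum of PARSED over segments:   ", parsed)
    print("readdb TOTAL urls:             ", CRAWLDB_TOTAL_URLS)
    print("readdb db_fetched:             ", CRAWLDB_FETCHED)
```

Whichever column is summed, the total across segments is far above the 3634 db_fetched reported by readdb, which is the mismatch in question.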
I don't understand: does the last figure in readseg (81824) show the count for
that particular segment (20091213151344), or the total parsed pages
across all segments?
Thanks
-Bhavin
On Tue, Dec 15, 2009 at 1:20 PM, MilleBii <[email protected]> wrote:
> Everything seems right.
> Both stats are interesting, and it all depends on what you are looking for.
>
> Readdb gives you global stats, whereas readseg is about each segment, i.e.
> each fetch/parse run.
>
> 2009/12/15, bhavin pandya <[email protected]>:
>> Hi,
>>
>> I am using Nutch 1.0.
>>
>> As a simple exercise I have crawled one single domain, and after that I
>> tried both commands, readdb and readseg.
>> They show different figures. Which one should I consider? Did
>> something go wrong while crawling?
>>
>> Here is the output of both commands.
>>
>> OUTPUT FROM READDB:
>> ----------------------------------------
>> CrawlDb statistics start: crawled/crawldb
>> Statistics for CrawlDb: crawled/crawldb
>> TOTAL urls: 84178
>> retry 0: 84175
>> retry 1: 3
>> min score: 0.0
>> avg score: 7.1693314E-5
>> max score: 1.2
>> status 1 (db_unfetched): 80475
>> status 2 (db_fetched): 3634
>> status 3 (db_gone): 8
>> status 4 (db_redir_temp): 29
>> status 5 (db_redir_perm): 32
>> CrawlDb statistics: done
>>
>>
>> OUTPUT FROM READSEG:
>> -------------------------------------------
>> NAME            GENERATED  FETCHER START        FETCHER END          FETCHED  PARSED
>> 20091212212627  1          2009-12-12T21:28:29  2009-12-12T21:28:29  1        1
>> 20091212212951  81         2009-12-12T21:32:20  2009-12-12T21:32:54  105      80
>> 20091212213347  3691       2009-12-12T21:36:13  2009-12-12T22:16:39  3738     3621
>> 20091212222210  84178      2009-12-12T22:24:30  2009-12-13T11:08:28  85189    81806
>> 20091213151344  84178      2009-12-13T15:16:37  2009-12-14T05:50:45  85195    81824
>>
>>
>> Thanks.
>> Bhavin
>>
>
>
> --
> -MilleBii-
>
--
- Bhavin