Hi,
Thanks for your prompt reply.

But according to readdb, the crawldb has only 3634 fetched pages:

>> status 1 (db_unfetched):        80475
>> status 2 (db_fetched):  3634

According to readseg, however, if I add up the fetched/parsed pages for
all segments, the total is much higher (1 + 81 + 3691 + 84178 + 84178):

NAME            GENERATED  FETCHER START        FETCHER END          FETCHED  PARSED
20091212212627  1          2009-12-12T21:28:29  2009-12-12T21:28:29  1        1
20091212212951  81         2009-12-12T21:32:20  2009-12-12T21:32:54  105      80
20091212213347  3691       2009-12-12T21:36:13  2009-12-12T22:16:39  3738     3621
20091212222210  84178      2009-12-12T22:24:30  2009-12-13T11:08:28  85189    81806
20091213151344  84178      2009-12-13T15:16:37  2009-12-14T05:50:45  85195    81824
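
To make the arithmetic concrete, here is a small Python sketch that sums
each readseg column and prints the totals next to the readdb figure. The
numbers are copied by hand from the listing above; the names SEGMENTS,
DB_FETCHED and column_totals are my own and are not part of Nutch. It
assumes each row is a per-segment count, which is exactly the point I am
unsure about. Note that the figures I added in the sentence above match
the GENERATED column.

```python
# Rows copied by hand from the readseg listing:
# (segment name, GENERATED, FETCHED, PARSED).
SEGMENTS = [
    ("20091212212627", 1, 1, 1),
    ("20091212212951", 81, 105, 80),
    ("20091212213347", 3691, 3738, 3621),
    ("20091212222210", 84178, 85189, 81806),
    ("20091213151344", 84178, 85195, 81824),
]

# "status 2 (db_fetched)" as reported by readdb.
DB_FETCHED = 3634


def column_totals(segments):
    """Sum the GENERATED, FETCHED and PARSED columns over all segments."""
    generated = sum(row[1] for row in segments)
    fetched = sum(row[2] for row in segments)
    parsed = sum(row[3] for row in segments)
    return generated, fetched, parsed


if __name__ == "__main__":
    generated, fetched, parsed = column_totals(SEGMENTS)
    print("sum of GENERATED:", generated)
    print("sum of FETCHED:  ", fetched)
    print("sum of PARSED:   ", parsed)
    print("readdb db_fetched:", DB_FETCHED)
```

Whichever column is summed, the total is far above the 3634 that readdb
reports, which is the discrepancy I am asking about.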

I don't understand: does the last figure in readseg (81824) show the count
for that particular segment (20091213151344), or the total number of parsed
pages across all segments?

Thanks
-Bhavin


On Tue, Dec 15, 2009 at 1:20 PM, MilleBii <[email protected]> wrote:
> Everything seems right.
> Both stats are interesting; which one matters depends on what you are
> looking for.
>
> Readdb gives you global stats, whereas readseg reports on each segment,
> i.e. each fetch/parse run.
>
> 2009/12/15, bhavin pandya <[email protected]>:
>> Hi,
>>
>> I am using Nutch 1.0.
>>
>> As a simple exercise I crawled a single domain, and after that I
>> tried both the readdb and readseg commands.
>> They show different figures. Which one should I consider? Did
>> something go wrong while crawling?
>>
>> Here is the output of both commands.
>>
>> OUTPUT FROM READDB:
>> ----------------------------------------
>> CrawlDb statistics start: crawled/crawldb
>> Statistics for CrawlDb: crawled/crawldb
>> TOTAL urls:     84178
>> retry 0:        84175
>> retry 1:        3
>> min score:      0.0
>> avg score:      7.1693314E-5
>> max score:      1.2
>> status 1 (db_unfetched):        80475
>> status 2 (db_fetched):  3634
>> status 3 (db_gone):     8
>> status 4 (db_redir_temp):       29
>> status 5 (db_redir_perm):       32
>> CrawlDb statistics: done
>>
>>
>> OUTPUT FROM READSEG:
>> -------------------------------------------
>> NAME            GENERATED  FETCHER START        FETCHER END          FETCHED  PARSED
>> 20091212212627  1          2009-12-12T21:28:29  2009-12-12T21:28:29  1        1
>> 20091212212951  81         2009-12-12T21:32:20  2009-12-12T21:32:54  105      80
>> 20091212213347  3691       2009-12-12T21:36:13  2009-12-12T22:16:39  3738     3621
>> 20091212222210  84178      2009-12-12T22:24:30  2009-12-13T11:08:28  85189    81806
>> 20091213151344  84178      2009-12-13T15:16:37  2009-12-14T05:50:45  85195    81824
>>
>>
>> Thanks.
>> Bhavin
>>
>
>
> --
> -MilleBii-
>



-- 
- Bhavin
