Hello,
I assume the second set of counts is printed by some tool accessing the WebDB. Right?
If so, 2 250 000 is the number of pages generated to be fetched (that is, all fetched pages plus fetch attempts that ended in an error), simply the total number of pages in the segments. The second number is the number of pages/links in the WebDB: pages and links known to Nutch, gathered by extracting links from already fetched pages. Some of these pages have already been fetched, but others are still to be fetched in the future.
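As a rough sanity check of the arithmetic, here is a small Python sketch using the figures from the question. The variable names are only illustrative, and it assumes that no page appears in more than one segment, which may not hold exactly in practice.

```python
# Figures quoted in the original question.
segments = 90
pages_per_segment = 25_000
webdb_pages = 4_318_557  # "Number of pages" reported for the WebDB

# Pages generated for fetching across all segments
# (successful fetches plus attempts that ended in an error).
generated = segments * pages_per_segment

# Pages known to the WebDB that have not been put into any segment yet.
# Assumption: every generated page is counted once (no overlap between segments).
known_not_generated = webdb_pages - generated

print(generated)
print(known_not_generated)
```

Under that assumption, a bit under half of the pages known to the WebDB have not been scheduled for fetching yet, which is why the WebDB count is about twice the segment total.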
Regards
Piotr

Ilia S. Yatsenko wrote:
Hello. Sorry for my limited English.
How does Nutch count documents in the search index?

I have 90 segments with 25000 pages in each segment.

The total is 2 250 000 pages in the index (I see this number when I run
mergesegs).

But at the same time Nutch reports:

Number of pages: 4318557
Number of links: 5541456
Why do I see about twice as many pages as I actually have in the index?

