Lukas Vlcek wrote:
Hi,
I am using Nutch 0.8-dev. I have a small shell script for the
generate/fetch/update cycle, and I used the generate command with -topN 500.
After crawling about 2000 pages I changed -topN to 3 (yes, three pages
only) to see which pages get crawled.
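The cycle itself is just the usual Nutch 0.8 commands, roughly the
following (the crawl/crawldb and crawl/segments paths are only
placeholders for my actual directories):

    #!/bin/sh
    # one generate/fetch/update round against an existing crawldb
    bin/nutch generate crawl/crawldb crawl/segments -topN 500
    # pick the newest segment (segment names are timestamps, so they sort)
    segment=`ls -d crawl/segments/* | tail -1`
    bin/nutch fetch $segment
    bin/nutch updatedb crawl/crawldb $segment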
I found that the generate/fetch/update cycles always crawl the same
three pages!
I would expect it to crawl different pages in every cycle (we have more
than 3 pages on our intranet, and I am sure I injected enough link food).
Can anybody tell me what I am doing wrong?
This indeed sounds strange; it looks like the information for these pages
is not being updated in the db. What is the fetch interval for these
pages? Could you run a readdb -dump before and after updatedb?
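For example (the dump output directories are arbitrary, and crawl/crawldb
stands for your actual crawldb path):

    bin/nutch readdb crawl/crawldb -dump dump_before
    bin/nutch updatedb crawl/crawldb crawl/segments/<your_segment>
    bin/nutch readdb crawl/crawldb -dump dump_after

Then compare the status, fetch time and fetch interval of those three
URLs in the two dumps.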
--
Best regards,
Andrzej Bialecki <><
Information Retrieval, Semantic Web | Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com