Doğacan Güney wrote:
Hi,

On 9/18/07, Andrzej Bialecki <[EMAIL PROTECTED]> wrote:
Tim Gautier wrote:

It seems to me that there may be a bug in either updatedb or readdb
-stats.  Can anyone help me out?  I'm really hoping that I'm just
doing something wrong, but I can't figure out what it might be.

I'm trying to track down a similar issue in an existing installation,
which uses a snapshot of the trunk/ as of ~3 months ago, and where we
applied the patches in NUTCH-522 and NUTCH-547. Unfortunately, the
diff relative to the current trunk/ is several MB in size ...

Could you perform the same test with the version before NUTCH-439? In
terms of date, this would be before 2007-08-21 12:50:07 +0200, and
before rev. 568053.

Andrzej, do you think that this is somehow related to a commit around
NUTCH-439? Looking at CHANGES.TXT I don't see anything there that can
trigger such a bug.

Right, I'm confused too. In my case, a very similar problem appeared
during fetching: before I applied the code from these two issues,
everything worked fine; after I applied the patches, both Fetcher and
Fetcher2 started losing URLs, i.e. they would fetch only about 1/10th
of the URLs in the fetchlist, with no messages in the logs ... This of
course later caused strange results during updatedb.

I would hazard a guess that this is related to the adaptive crawl code.
There may still be a bug there that we are missing or one of the later
commits might have broken it.

Right. Let's keep digging ...

--
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com
