Doğacan Güney wrote:
On 5/18/07, Andrzej Bialecki <[EMAIL PROTECTED]> wrote:
Doğacan Güney wrote:
> Hi everyone,
>
> Has anyone tried Fetcher2 from the latest trunk? In our tests, Fetcher2
> is always slower (by a large margin) than Fetcher.
>
> For a segment with ~30000 urls, we ran Fetcher with 150 threads and
> Fetcher2 with 50 threads. Fetcher finishes in around 1 hour, while
> Fetcher2 takes around 4 hours. We ran this test more than once and
> got similar results.
>
> Are we running Fetcher2 with too few/too many threads? I was under the
> impression that Fetcher2 doesn't need as many threads as Fetcher since
> threads do not block.
Yes, that was the idea. Could you test it with the same number of
threads? Is the configuration identical in all other aspects?
Yes, it is identical in other aspects. I am currently testing with
the same number of threads. Will report if there is a difference.
Are you running the version with the fix from NUTCH-474?
>
> Any suggestions?
>
If you already have a setup to reproduce this, you could perhaps spend
some time debugging this ... add some timing info, and queue info
logging.
What do you think would be a good place (or places) to add debug info?
Looking at the code, I am not sure where to add it.
FetchItemQueues.getFetchItem() and FetchItemQueue.getFetchItem() would
be good places to start - the logging here would show how frequently
they are called, and why fetch items are not picked up (perhaps
per-queue blocking is buggy?).
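
For illustration, here is a minimal sketch of the kind of counters and
logging this could mean. It is a simplified stand-in, not the actual
Fetcher2 code: the class DebugFetchQueues, its fields, the single
in-flight request per host, and the explicit "now" parameter are all
assumptions made for the example. The idea is only to count how often
getFetchItem() is called and why it returns nothing.

```java
import java.util.ArrayDeque;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Queue;

// Hypothetical, simplified model of per-host fetch queues with debug counters.
// None of these names are taken from the real Fetcher2 sources.
class DebugFetchQueues {

    // One queue per host: at most one request in flight, plus a politeness delay.
    static class HostQueue {
        final Queue<String> urls = new ArrayDeque<>();
        final long minDelayMs;
        long nextFetchTime = 0; // earliest time the next item may be handed out
        int inProgress = 0;     // items handed out but not yet finished

        HostQueue(long minDelayMs) {
            this.minDelayMs = minDelayMs;
        }
    }

    final Map<String, HostQueue> queues = new LinkedHashMap<>();

    // Debug counters: how often we were asked, and why we said no.
    long calls, hits, skippedEmpty, blockedInProgress, blockedByDelay;

    synchronized void add(String host, String url, long minDelayMs) {
        queues.computeIfAbsent(host, h -> new HostQueue(minDelayMs)).urls.add(url);
    }

    // Returns the next fetchable URL, or null if every queue is empty or blocked.
    // The caller passes the current time so the behaviour is easy to test.
    synchronized String getFetchItem(long now) {
        calls++;
        for (HostQueue q : queues.values()) {
            if (q.urls.isEmpty()) { skippedEmpty++; continue; }
            if (q.inProgress > 0) { blockedInProgress++; continue; }
            if (now < q.nextFetchTime) { blockedByDelay++; continue; }
            q.inProgress++;
            hits++;
            return q.urls.poll();
        }
        return null;
    }

    // Called when a fetch completes; starts the politeness delay for that host.
    synchronized void finish(String host, long now) {
        HostQueue q = queues.get(host);
        if (q == null) return;
        q.inProgress--;
        q.nextFetchTime = now + q.minDelayMs;
    }

    // One-line summary suitable for a periodic debug log message.
    synchronized String stats() {
        return String.format(
            "calls=%d hits=%d empty=%d blockedInProgress=%d blockedByDelay=%d",
            calls, hits, skippedEmpty, blockedInProgress, blockedByDelay);
    }
}
```

Logging stats() every few seconds from the fetcher threads would show
whether most calls come back empty because of the per-host delay or
because an earlier request was never marked as finished.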
--
Best regards,
Andrzej Bialecki <><
___. ___ ___ ___ _ _ __________________________________
[__ || __|__/|__||\/| Information Retrieval, Semantic Web
___|||__|| \| || | Embedded Unix, System Integration
http://www.sigram.com Contact: info at sigram dot com