Hello,

I am currently running Nutch 0.9
on 4 nodes. One node is the master
and also acts as a slave.

I have injected 100k URLs into Nutch.
All URLs are on the same host.

I am running a generate/fetch/update
cycle with topN set to 100k.

However, each cycle fetches only
between 2588 and 2914 URLs. I have
run this more than 8 times, all with
the same result.

I have tried both nutch fetch and
nutch fetch2.

My hypothesis is that this is due to all
URLs being on the same host (www.example.com/some/path).

Do I need to set fetcher.threads.per.host
to something higher than the default of 2?
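
For reference, this is roughly the override
I have in mind for conf/nutch-site.xml. The
value 10 is only a placeholder I picked for
illustration, not something I have tested:

```xml
<configuration>
  <!-- Hypothetical override: allow more simultaneous
       fetcher threads against a single host.
       The value 10 is an untested assumption. -->
  <property>
    <name>fetcher.threads.per.host</name>
    <value>10</value>
  </property>
</configuration>
```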

Is there something in the logs I should
look for to determine the exact cause of
this problem?

Thank you in advance for any assistance
that can be provided.

If you need any additional information,
please let me know and I'll send it.

Thanks!

JohnM

-- 
john mendenhall
[EMAIL PROTECTED]
surf utopia
internet services
