Hi jc

<<
I don't understand why there are 19 queues, is it maybe that only 19
websites are being fetched?
>>
Each queue holds the FetchItems that share the same Queue ID (a
proto/hostname, proto/IP or proto/domain pair), and the Queue ID is
built according to the queueMode argument. So there are probably 19
different Queue IDs in FetchItemQueues.
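
To make the grouping concrete, here is a rough sketch of how URLs could
be mapped to Queue IDs. It is not the actual Nutch code: the class
QueueIdSketch, its methods and the example URLs are all made up for
illustration, the "byDomain" branch just keeps the last two host labels,
and the byIP mode is left out so that no DNS lookup is needed.

```java
import java.net.URI;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration, NOT Nutch's implementation.
public class QueueIdSketch {

    // Assumption: a naive "domain" is the last two labels of the host name.
    static String naiveDomain(String host) {
        String[] parts = host.split("\\.");
        int n = parts.length;
        return n <= 2 ? host : parts[n - 2] + "." + parts[n - 1];
    }

    // Builds a proto/host or proto/domain pair, depending on the queue mode.
    static String queueId(String url, String queueMode) {
        URI u = URI.create(url);
        String proto = u.getScheme().toLowerCase();
        String host = u.getHost().toLowerCase();
        String key = "byDomain".equals(queueMode) ? naiveDomain(host) : host;
        return proto + "://" + key;
    }

    // One queue exists per distinct Queue ID, so count the distinct IDs.
    static int countQueues(List<String> urls, String queueMode) {
        Set<String> ids = new TreeSet<>();
        for (String url : urls) {
            ids.add(queueId(url, queueMode));
        }
        return ids.size();
    }

    public static void main(String[] args) {
        List<String> urls = List.of(
                "http://www.example.com/a",
                "http://www.example.com/b",
                "http://news.example.com/c",
                "https://www.example.org/");
        System.out.println("byHost queues:   " + countQueues(urls, "byHost"));
        System.out.println("byDomain queues: " + countQueues(urls, "byDomain"));
    }
}
```

The point is only that the number of queues follows the number of
distinct IDs in the fetch list, not the number of seed URLs.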

<<
 Anyways, why is it that there are 194 spinwaiting out of 200 active
threads?
>>
First of all, I see that the parameter "fetcher.threads.per.host" has been
replaced by "fetcher.threads.per.queue" in Nutch 1.6. There are 200 fetcher
threads that can fetch items from any host. However, all the remaining
items, 10000 URLs in total, come from only 19 different hosts, and each
queue holds the items of a single Queue ID. With
fetcher.threads.per.queue = 1, at most one thread can work on a queue at
a time, so the other threads have nothing to do and spin-wait. The log
indicates that only 6 threads are fetching while the other 13 have
finished fetching. Maybe those 13 queues are small and do not take much
time.
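
As a back-of-the-envelope check, the upper bound on busy threads can be
worked out from the numbers in your log. This is only a sketch under the
assumption that a queue never serves more threads than
fetcher.threads.per.queue allows; the class and method names are made up
and it ignores any politeness delay between requests to the same host,
which would lower the number of busy threads further.

```java
// Hypothetical arithmetic sketch, not part of Nutch.
public class SpinWaitSketch {

    // Upper bound on threads that can fetch at the same time.
    static int maxBusyThreads(int totalThreads, int queues, int threadsPerQueue) {
        return Math.min(totalThreads, queues * threadsPerQueue);
    }

    // Lower bound on threads left with nothing to fetch.
    static int minSpinWaiting(int totalThreads, int queues, int threadsPerQueue) {
        return totalThreads - maxBusyThreads(totalThreads, queues, threadsPerQueue);
    }

    public static void main(String[] args) {
        // Values taken from the settings and log line in the original mail.
        int totalThreads = 200;
        int queues = 19;
        int threadsPerQueue = 1;
        System.out.println("busy threads, at most: "
                + maxBusyThreads(totalThreads, queues, threadsPerQueue));
        System.out.println("spinwaiting threads, at least: "
                + minSpinWaiting(totalThreads, queues, threadsPerQueue));
    }
}
```

With 19 queues and one thread per queue, no more than 19 of the 200
threads can be busy, so at least 181 will spin-wait. The 194 in your log
is consistent with that bound.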

Thanks
lufeng


On Fri, Mar 1, 2013 at 6:44 AM, jc <[email protected]> wrote:

> Hi guys,
>
> I'm sorry if this question has been answered before, I looked but didn't
> find anything.
>
> This is my scenario (only relevant settings I think):
> seed urls: about 60 homepages from different domains
> generate.max.count = 10000
> fetcher.threads.per.host = 3   I'm trying to be polite here :-)
> partition.url.mode = byHost
> fetcher.threads.fetch = 200
> fetcher.threads.per.queue = 1
> topN = 1000000
> depth = 1
>
> Since the very beginning I've got a lot of spinwaiting threads (I'm not
> sure if those are threads because it doesn't really say in the log)
>
> 194/200 spinwaiting/active, 166 pages, 3 errors, 4.7 3.8 pages/s , 1471 1412 kb/s, 10000 URLs in 19 queues
>
> I don't understand why there are 19 queues, is it maybe that only 19
> websites are being fetched? Anyways, why is it that there are 194
> spinwaiting out of 200 active threads?
>
> Thanks a lot in advance for your time.
>
> Regards,
> jc
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/a-lot-of-threads-spinwaiting-tp4043801.html
> Sent from the Nutch - User mailing list archive at Nabble.com.
>



-- 
Don't Grow Old, Grow Up... :-)
