Hi,

I am new to Nutch and am trying to figure out the best configuration for
crawling. In my Nutch configuration, I have set
fetcher.threads.per.queue to 2 and fetcher.server.min.delay to 2
seconds.
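
For reference, the two settings would look roughly like this in
conf/nutch-site.xml. This is only a sketch of the usual property layout,
assuming the standard override file is used; it is not a copy of the actual
file, and the values are the ones mentioned above.

```xml
<!-- Sketch only: assumed overrides in conf/nutch-site.xml -->
<configuration>
  <property>
    <!-- maximum number of threads fetching from the same queue at once -->
    <name>fetcher.threads.per.queue</name>
    <value>2</value>
  </property>
  <property>
    <!-- minimum delay, in seconds, between successive requests to the same server -->
    <name>fetcher.server.min.delay</name>
    <value>2.0</value>
  </property>
</configuration>
```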

As I understand the documentation, in this case Nutch won't make more than
2 requests every 2 seconds, so the request rate will average out to
1 request per second.

Is this correct? Even though Nutch ignores robots.txt in this case, I assume
1 req/sec is a polite request rate for many servers.

It would be great if you could clarify this.


Thanks,
Charith


-- 
Charith Dhanushka Wickramaarachchi

Tel  +1 213 447 4253
Blog  http://charith.wickramaarachchi.org/
<http://charithwiki.blogspot.com/>
Twitter  @charithwiki <https://twitter.com/charithwiki>

