Hi Roannel, Sebastian,

Thanks for your explanation. The behavior of the fetcher.server.delay
property is now clearer to me.
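
To check the explanation against the log quoted below, here is a minimal
sketch in Python. It assumes the simple model from Sebastian's reply, that
each interval between two "fetching" lines is the fetch time plus the
configured delay, and it ignores any other overhead. The function and
variable names are only illustrative; nothing here is Nutch code.

```python
from datetime import datetime

# Timestamps of the "fetching" lines in the log quoted below.
FETCH_TIMES = [
    "2015-11-11 20:56:25,981",
    "2015-11-11 20:56:30,791",
    "2015-11-11 20:56:35,960",
    "2015-11-11 20:56:40,552",
    "2015-11-11 20:56:45,128",
    "2015-11-11 20:56:49,967",
    "2015-11-11 20:56:54,746",
]

# Value configured for fetcher.server.delay, in seconds.
CONFIGURED_DELAY = 2.5


def parse(stamp):
    """Parse a log timestamp such as '2015-11-11 20:56:25,981'."""
    return datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S,%f")


def intervals(stamps):
    """Seconds elapsed between each pair of consecutive fetch starts."""
    times = [parse(s) for s in stamps]
    return [(b - a).total_seconds() for a, b in zip(times, times[1:])]


def implied_fetch_times(stamps, delay):
    """Assumed model: interval = fetch time + delay."""
    return [gap - delay for gap in intervals(stamps)]


if __name__ == "__main__":
    gaps = intervals(FETCH_TIMES)
    fetches = implied_fetch_times(FETCH_TIMES, CONFIGURED_DELAY)
    for gap, fetch in zip(gaps, fetches):
        print(f"interval {gap:.3f}s = delay {CONFIGURED_DELAY}s "
              f"+ about {fetch:.3f}s of fetching")
```

With these timestamps the intervals come out between roughly 4.6 and 5.2
seconds, which would leave about 2.1 to 2.7 seconds per fetch if the model
holds.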

Regards

Andrés

2015-11-23 15:16 GMT-05:00 Sebastian Nagel <[email protected]>:

> Hi Andrés, hi Roannel,
>
> that's correct but the question was why the effective
> delay is "bigger" than the configured 2.5 sec.
>
> Nutch implements the delay as a sleep after one
> document has been fetched and before the next
> document is fetched. The observed 4-5 sec. therefore
> include the time spent fetching plus the delay.
>
> In case a "Crawl-delay" is specified in robots.txt,
> the configured delay is overridden by the value
> from robots.txt. Although crawlers (or search engines)
> may differ in the definition of the "Crawl-delay",
> at least, some use it exactly in the sense Nutch does,
> cf.
> https://yandex.com/support/webmaster/controlling-robot/robots-txt.xml#crawl-delay
>
> Sebastian
>
>
> On 11/23/2015 03:52 PM, Roannel Fernández Hernández wrote:
> > Hi Andrés:
> >
> > The fetcher.server.delay property, as its description says, is the number
> > of seconds the fetcher will delay between successive requests to the same
> > server. So, if you set fetcher.server.delay to 2.5, Nutch will wait 2.5
> > seconds before making another request to the same server. The delay does
> > not apply between requests to different servers.
> >
> > Regards.
> >
> > ----- Original Message -----
> >> From: "Andrés Rincón Pacheco" <[email protected]>
> >> To: [email protected]
> >> Sent: Friday, November 20, 2015 18:35:48
> >> Subject: [MASSMAIL]fetcher.server.delay configuration not working
> >>
> >> Hi,
> >>
> >> I configured the fetcher.server.delay property with 2.5 as value, but
> >> when Nutch is fetching URLs, the time between fetches is bigger than the
> >> configured value.
> >>
> >> Here is some information from the run.
> >>
> >> Fetch time               Difference (s)
> >> 2015-11-11 20:56:49,967  5
> >> 2015-11-11 20:56:54,746  5
> >> 2015-11-11 20:56:59,391  4
> >> 2015-11-11 20:57:04,264  5
> >> 2015-11-11 20:57:09,212  5
> >> 2015-11-11 20:57:13,873  5
> >> 2015-11-11 20:57:18,549  5
> >>
> >> And some lines from the log:
> >>
> >> 2015-11-11 20:56:21,674 INFO  fetcher.Fetcher - -activeThreads=50,
> >> spinWaiting=49, fetchQueues.totalSize=359, fetchQueues.getQueueCount=1
> >> 2015-11-11 20:56:22,674 INFO  fetcher.Fetcher - -activeThreads=50,
> >> spinWaiting=49, fetchQueues.totalSize=359, fetchQueues.getQueueCount=1
> >> 2015-11-11 20:56:23,675 INFO  fetcher.Fetcher - -activeThreads=50,
> >> spinWaiting=50, fetchQueues.totalSize=359, fetchQueues.getQueueCount=1
> >> 2015-11-11 20:56:24,676 INFO  fetcher.Fetcher - -activeThreads=50,
> >> spinWaiting=50, fetchQueues.totalSize=359, fetchQueues.getQueueCount=1
> >> 2015-11-11 20:56:25,677 INFO  fetcher.Fetcher - -activeThreads=50,
> >> spinWaiting=50, fetchQueues.totalSize=359, fetchQueues.getQueueCount=1
> >> 2015-11-11 20:56:25,981 INFO  fetcher.Fetcher - fetching http://www.abcderer
> >> (queue crawl delay=2500ms)
> >> 2015-11-11 20:56:26,677 INFO  fetcher.Fetcher - -activeThreads=50,
> >> spinWaiting=49, fetchQueues.totalSize=358, fetchQueues.getQueueCount=1
> >> 2015-11-11 20:56:27,678 INFO  fetcher.Fetcher - -activeThreads=50,
> >> spinWaiting=49, fetchQueues.totalSize=358, fetchQueues.getQueueCount=1
> >> 2015-11-11 20:56:28,679 INFO  fetcher.Fetcher - -activeThreads=50,
> >> spinWaiting=50, fetchQueues.totalSize=358, fetchQueues.getQueueCount=1
> >> 2015-11-11 20:56:29,679 INFO  fetcher.Fetcher - -activeThreads=50,
> >> spinWaiting=50, fetchQueues.totalSize=358, fetchQueues.getQueueCount=1
> >> 2015-11-11 20:56:30,680 INFO  fetcher.Fetcher - -activeThreads=50,
> >> spinWaiting=50, fetchQueues.totalSize=358, fetchQueues.getQueueCount=1
> >> 2015-11-11 20:56:30,791 INFO  fetcher.Fetcher - fetching http://www.abcderer
> >> (queue crawl delay=2500ms)
> >> 2015-11-11 20:56:31,681 INFO  fetcher.Fetcher - -activeThreads=50,
> >> spinWaiting=49, fetchQueues.totalSize=357, fetchQueues.getQueueCount=1
> >> 2015-11-11 20:56:32,681 INFO  fetcher.Fetcher - -activeThreads=50,
> >> spinWaiting=49, fetchQueues.totalSize=357, fetchQueues.getQueueCount=1
> >> 2015-11-11 20:56:33,682 INFO  fetcher.Fetcher - -activeThreads=50,
> >> spinWaiting=50, fetchQueues.totalSize=357, fetchQueues.getQueueCount=1
> >> 2015-11-11 20:56:34,683 INFO  fetcher.Fetcher - -activeThreads=50,
> >> spinWaiting=50, fetchQueues.totalSize=357, fetchQueues.getQueueCount=1
> >> 2015-11-11 20:56:35,684 INFO  fetcher.Fetcher - -activeThreads=50,
> >> spinWaiting=50, fetchQueues.totalSize=357, fetchQueues.getQueueCount=1
> >> 2015-11-11 20:56:35,960 INFO  fetcher.Fetcher - fetching http://www.abcderer
> >> (queue crawl delay=2500ms)
> >> 2015-11-11 20:56:36,684 INFO  fetcher.Fetcher - -activeThreads=50,
> >> spinWaiting=49, fetchQueues.totalSize=356, fetchQueues.getQueueCount=1
> >> 2015-11-11 20:56:37,685 INFO  fetcher.Fetcher - -activeThreads=50,
> >> spinWaiting=49, fetchQueues.totalSize=356, fetchQueues.getQueueCount=1
> >> 2015-11-11 20:56:38,686 INFO  fetcher.Fetcher - -activeThreads=50,
> >> spinWaiting=50, fetchQueues.totalSize=356, fetchQueues.getQueueCount=1
> >> 2015-11-11 20:56:39,687 INFO  fetcher.Fetcher - -activeThreads=50,
> >> spinWaiting=50, fetchQueues.totalSize=356, fetchQueues.getQueueCount=1
> >> 2015-11-11 20:56:40,552 INFO  fetcher.Fetcher - fetching http://www.abcderer
> >> (queue crawl delay=2500ms)
> >> 2015-11-11 20:56:40,688 INFO  fetcher.Fetcher - -activeThreads=50,
> >> spinWaiting=49, fetchQueues.totalSize=355, fetchQueues.getQueueCount=1
> >> 2015-11-11 20:56:41,689 INFO  fetcher.Fetcher - -activeThreads=50,
> >> spinWaiting=49, fetchQueues.totalSize=355, fetchQueues.getQueueCount=1
> >> 2015-11-11 20:56:42,690 INFO  fetcher.Fetcher - -activeThreads=50,
> >> spinWaiting=50, fetchQueues.totalSize=355, fetchQueues.getQueueCount=1
> >> 2015-11-11 20:56:43,691 INFO  fetcher.Fetcher - -activeThreads=50,
> >> spinWaiting=50, fetchQueues.totalSize=355, fetchQueues.getQueueCount=1
> >> 2015-11-11 20:56:44,691 INFO  fetcher.Fetcher - -activeThreads=50,
> >> spinWaiting=50, fetchQueues.totalSize=355, fetchQueues.getQueueCount=1
> >> 2015-11-11 20:56:45,128 INFO  fetcher.Fetcher - fetching http://www.abcderer
> >> (queue crawl delay=2500ms)
> >> 2015-11-11 20:56:45,692 INFO  fetcher.Fetcher - -activeThreads=50,
> >> spinWaiting=49, fetchQueues.totalSize=354, fetchQueues.getQueueCount=1
> >> 2015-11-11 20:56:46,693 INFO  fetcher.Fetcher - -activeThreads=50,
> >> spinWaiting=49, fetchQueues.totalSize=354, fetchQueues.getQueueCount=1
> >> 2015-11-11 20:56:47,694 INFO  fetcher.Fetcher - -activeThreads=50,
> >> spinWaiting=50, fetchQueues.totalSize=354, fetchQueues.getQueueCount=1
> >> 2015-11-11 20:56:48,695 INFO  fetcher.Fetcher - -activeThreads=50,
> >> spinWaiting=50, fetchQueues.totalSize=354, fetchQueues.getQueueCount=1
> >> 2015-11-11 20:56:49,696 INFO  fetcher.Fetcher - -activeThreads=50,
> >> spinWaiting=50, fetchQueues.totalSize=354, fetchQueues.getQueueCount=1
> >> 2015-11-11 20:56:49,967 INFO  fetcher.Fetcher - fetching http://www.abcderer
> >> (queue crawl delay=2500ms)
> >> 2015-11-11 20:56:50,697 INFO  fetcher.Fetcher - -activeThreads=50,
> >> spinWaiting=49, fetchQueues.totalSize=353, fetchQueues.getQueueCount=1
> >> 2015-11-11 20:56:51,698 INFO  fetcher.Fetcher - -activeThreads=50,
> >> spinWaiting=49, fetchQueues.totalSize=353, fetchQueues.getQueueCount=1
> >> 2015-11-11 20:56:52,699 INFO  fetcher.Fetcher - -activeThreads=50,
> >> spinWaiting=50, fetchQueues.totalSize=353, fetchQueues.getQueueCount=1
> >> 2015-11-11 20:56:53,700 INFO  fetcher.Fetcher - -activeThreads=50,
> >> spinWaiting=50, fetchQueues.totalSize=353, fetchQueues.getQueueCount=1
> >> 2015-11-11 20:56:54,701 INFO  fetcher.Fetcher - -activeThreads=50,
> >> spinWaiting=50, fetchQueues.totalSize=353, fetchQueues.getQueueCount=1
> >> 2015-11-11 20:56:54,746 INFO  fetcher.Fetcher - fetching http://www.abcderer
> >> (queue crawl delay=2500ms)
> >>
> >> What is wrong in the configuration?
> >>
> >> Thanks.
> >>
> >
>
>
