[ 
https://issues.apache.org/jira/browse/NUTCH-2946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17539600#comment-17539600
 ] 

Hudson commented on NUTCH-2946:
-------------------------------

SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #74 (See 
[https://ci-builds.apache.org/job/Nutch/job/Nutch-trunk/74/])
NUTCH-2946 Fetcher: slow down fetching from hosts where requests fail 
repeatedly (snagel: 
[https://github.com/apache/nutch/commit/42ae2a34505e23319861e7b31fd9f87f1af68749])
* (edit) conf/nutch-default.xml
* (edit) src/java/org/apache/nutch/fetcher/FetchItemQueues.java
NUTCH-2946 Fetcher: optionally slow down fetching from hosts with repeated 
exceptions (snagel: 
[https://github.com/apache/nutch/commit/bdbe7b330b5e7fd712f1b5126f69e2efebb194e8])
* (edit) src/java/org/apache/nutch/fetcher/FetchItemQueues.java
* (edit) conf/nutch-default.xml


> Fetcher: optionally slow down fetching from hosts with repeated exceptions
> --------------------------------------------------------------------------
>
>                 Key: NUTCH-2946
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2946
>             Project: Nutch
>          Issue Type: Improvement
>          Components: fetcher
>    Affects Versions: 1.18
>            Reporter: Sebastian Nagel
>            Assignee: Sebastian Nagel
>            Priority: Major
>             Fix For: 1.19
>
>
> For every fetch queue, the fetcher keeps a counter of the "exceptions" 
> observed when fetching from the host (resp. domain or IP) bound to this 
> queue.
> As an improvement to increase the politeness of the crawler, the counter 
> value could be used to dynamically increase the fetch delay for hosts where 
> requests fail repeatedly with exceptions or HTTP status codes mapped to 
> ProtocolStatus.EXCEPTION (HTTP 403 Forbidden, 429 Too many requests, 5xx 
> server errors, etc.). Of course, this should be optional. The aim is to 
> reduce the load on such hosts before the configured maximum number of 
> exceptions (property fetcher.max.exceptions.per.queue) is hit.
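
The idea described above can be illustrated with a minimal sketch. It is not the code from the commits referenced in this message: the class name FetchDelaySketch, the method delayMillis, all parameter names, and the choice of an exponential backoff with a cap are assumptions made for illustration only. The issue text only says that the delay should grow with the per-queue exception counter.

```java
/**
 * Hypothetical sketch, NOT the Nutch implementation: derive a per-queue
 * fetch delay from the number of exceptions observed for that queue.
 */
final class FetchDelaySketch {

  private FetchDelaySketch() {}

  /**
   * @param baseDelayMs    regular politeness delay of the queue (assumed name)
   * @param backoffStepMs  extra delay added after the first exception (assumed)
   * @param exceptionCount current value of the queue's exception counter
   * @param maxDelayMs     upper bound for the resulting delay (assumed)
   * @return delay in milliseconds to wait before the next fetch
   */
  static long delayMillis(long baseDelayMs, long backoffStepMs,
      int exceptionCount, long maxDelayMs) {
    if (exceptionCount <= 0) {
      // No failures observed: keep the regular delay.
      return Math.min(baseDelayMs, maxDelayMs);
    }
    // Double the extra delay with every further exception.
    // The shift is capped so that steps below 2^32 ms cannot overflow a long.
    int shift = Math.min(exceptionCount - 1, 30);
    long extra = backoffStepMs << shift;
    long headroom = maxDelayMs - baseDelayMs;
    return extra >= headroom ? maxDelayMs : baseDelayMs + extra;
  }
}
```

With a base delay of 5000 ms, a step of 1000 ms and a cap of 60000 ms, the first three exceptions would yield delays of 6000, 7000 and 9000 ms under these assumptions, so the load on a failing host drops well before fetcher.max.exceptions.per.queue is reached.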



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
