[ 
https://issues.apache.org/jira/browse/NUTCH-3114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18004285#comment-18004285
 ] 

Hudson commented on NUTCH-3114:
-------------------------------

SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #188 (See 
[https://ci-builds.apache.org/job/Nutch/job/Nutch-trunk/188/])
NUTCH-3114 Avoid stale fetching when only URLs (snagel: 
[https://github.com/apache/nutch/commit/14fc3309998ca8d115a5f3d504e1859911660dc5])
* (edit) conf/nutch-default.xml


> Avoid stale fetching when only URLs from queues blocked by the exponential 
> backoff remain 
> ------------------------------------------------------------------------------------------
>
>                 Key: NUTCH-3114
>                 URL: https://issues.apache.org/jira/browse/NUTCH-3114
>             Project: Nutch
>          Issue Type: Bug
>          Components: fetcher
>    Affects Versions: 1.19
>            Reporter: Sebastian Nagel
>            Assignee: Sebastian Nagel
>            Priority: Major
>             Fix For: 1.21
>
>
> The exponential backoff (NUTCH-2946) politely slows down fetching from queues 
> where requests fail repeatedly with exceptions or HTTP status codes (503, 
> 403, 429, etc.) mapped to the protocol status "EXCEPTION".
> However, the delay grows exponentially: starting with the default fetch 
> delay of 5 seconds, after the 8th exception the fetcher waits for five 
> minutes. If all "good" queues are exhausted and there is no time limit 
> ({{fetcher.timelimit.mins}}) or minimum throughput 
> ({{fetcher.throughput.threshold.pages}}) configured, fetching may become 
> stale and is finally stopped by the task timeout.
> The default for {{fetcher.max.exceptions.per.queue}} should be set to a 
> reasonably low value, so that queues where requests fail repeatedly with 
> exceptions are purged. With the current default of {{-1}}, queues are never 
> purged.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
