[ 
https://issues.apache.org/jira/browse/NUTCH-769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12783245#action_12783245
 ] 

Andrzej Bialecki  commented on NUTCH-769:
-----------------------------------------

The patch contains a new method, checkExceptionThreshold, which seems to do the 
right thing, but this method is never used in Fetcher. I think the idea was to 
call it in FetchItemQueues.finishItem()?
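
To make the suggestion concrete, here is a minimal, self-contained sketch of how
such a call could be wired in. It is not the code from the patch and does not use
the real Nutch classes: QueueExceptionSketch, addUrl, queueSize and the boolean
argument to finishItem are hypothetical names invented for illustration. It also
assumes that "more than a set number" means strictly greater than the limit, and
that a successful fetch resets the count (the "in a row" part of the description).

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

/** Hypothetical, simplified stand-in for the Fetcher's per-host queues. */
class QueueExceptionSketch {
    /** Mirrors fetcher.max.exceptions.per.queue; -1 disables the check. */
    private final int maxExceptionsPerQueue;
    private final Map<String, Queue<String>> queues = new HashMap<>();
    private final Map<String, Integer> consecutiveExceptions = new HashMap<>();

    QueueExceptionSketch(int maxExceptionsPerQueue) {
        this.maxExceptionsPerQueue = maxExceptionsPerQueue;
    }

    void addUrl(String queueId, String url) {
        queues.computeIfAbsent(queueId, k -> new ArrayDeque<>()).add(url);
    }

    int queueSize(String queueId) {
        Queue<String> q = queues.get(queueId);
        return q == null ? 0 : q.size();
    }

    /** Counts one more exception for the queue; returns the number of URLs dropped. */
    int checkExceptionThreshold(String queueId) {
        int count = consecutiveExceptions.merge(queueId, 1, Integer::sum);
        // Assumption: clear only once the count exceeds the configured limit.
        if (maxExceptionsPerQueue == -1 || count <= maxExceptionsPerQueue) {
            return 0;
        }
        Queue<String> q = queues.get(queueId);
        if (q == null) {
            return 0;
        }
        int dropped = q.size();
        q.clear();
        return dropped;
    }

    /** The call site proposed above: run the check whenever an item finishes with an exception. */
    int finishItem(String queueId, boolean failedWithException) {
        if (failedWithException) {
            return checkExceptionThreshold(queueId);
        }
        // Assumption: a success breaks the run of consecutive exceptions.
        consecutiveExceptions.remove(queueId);
        return 0;
    }
}
```

With a limit of 2, the third consecutive failure on a queue empties it; with the
default of -1 nothing is ever dropped.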

> Fetcher to skip queues for URLS getting repeated exceptions  
> -------------------------------------------------------------
>
>                 Key: NUTCH-769
>                 URL: https://issues.apache.org/jira/browse/NUTCH-769
>             Project: Nutch
>          Issue Type: Improvement
>          Components: fetcher
>            Reporter: Julien Nioche
>            Priority: Minor
>         Attachments: NUTCH-769.patch
>
>
> As discussed on the mailing list (see 
> http://www.mail-archive.com/nutch-u...@lucene.apache.org/msg15360.html) this 
> patch allows the Fetcher to clear URL queues when more than a set number 
> of exceptions have been encountered in a row. This can speed up the fetching 
> substantially in cases where target hosts are not responsive (as a 
> TimeoutException would be thrown) and limits cases where a whole Fetch step 
> is slowed down because of a few queues.
> By default, the parameter fetcher.max.exceptions.per.queue has a value of -1, 
> which deactivates the feature.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
