Fetcher to skip queues for URLs getting repeated exceptions
-------------------------------------------------------------

                 Key: NUTCH-769
                 URL: https://issues.apache.org/jira/browse/NUTCH-769
             Project: Nutch
          Issue Type: Improvement
          Components: fetcher
            Reporter: Julien Nioche
            Priority: Minor


As discussed on the mailing list (see 
http://www.mail-archive.com/nutch-u...@lucene.apache.org/msg15360.html), this 
patch allows the Fetcher to clear a URL queue when more than a set number of 
exceptions have been encountered in a row. This can speed up fetching 
substantially when target hosts are unresponsive (where a TimeoutException 
would be thrown) and limits cases where a whole fetch step is slowed down by 
a few queues.
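
The behaviour described above can be sketched roughly as follows. This is an 
illustration only, not the code from the patch: the class 
QueueExceptionTracker and its methods are hypothetical names, and resetting 
the counter after a successful fetch is an assumption based on the "in a row" 
wording.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; the names below are illustrative and are not Nutch's API.
class QueueExceptionTracker {
    // Threshold of consecutive exceptions; a negative value disables the check.
    private final int maxExceptionsPerQueue;
    // Pending URLs per queue (a queue typically maps to one host).
    private final Map<String, Deque<String>> queues = new HashMap<>();
    // Number of exceptions seen in a row for each queue.
    private final Map<String, Integer> consecutiveExceptions = new HashMap<>();

    QueueExceptionTracker(int maxExceptionsPerQueue) {
        this.maxExceptionsPerQueue = maxExceptionsPerQueue;
    }

    void addUrl(String queueId, String url) {
        queues.computeIfAbsent(queueId, k -> new ArrayDeque<>()).add(url);
    }

    int pending(String queueId) {
        Deque<String> q = queues.get(queueId);
        return q == null ? 0 : q.size();
    }

    // Assumption: a successful fetch breaks the run of exceptions.
    void recordSuccess(String queueId) {
        consecutiveExceptions.remove(queueId);
    }

    // Records one exception; returns how many pending URLs were dropped.
    int recordException(String queueId) {
        if (maxExceptionsPerQueue < 0) {
            return 0; // feature deactivated
        }
        int count = consecutiveExceptions.merge(queueId, 1, Integer::sum);
        if (count <= maxExceptionsPerQueue) {
            return 0; // threshold not yet exceeded
        }
        Deque<String> q = queues.get(queueId);
        int dropped = (q == null) ? 0 : q.size();
        if (q != null) {
            q.clear(); // skip everything still waiting in this queue
        }
        consecutiveExceptions.remove(queueId);
        return dropped;
    }
}
```

With a threshold of 2, the third consecutive exception on a queue clears 
whatever is still waiting in it, so the fetcher threads are free to work on 
responsive hosts.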

By default, the parameter fetcher.max.exceptions.per.queue has a value of -1, 
which deactivates the feature.
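
To activate the feature, the parameter would presumably be overridden in the 
usual Nutch way, with a property entry in conf/nutch-site.xml. The value 5 
below is an arbitrary example, not a recommendation from the patch.

```xml
<!-- Example override; the value 5 is illustrative only. -->
<property>
  <name>fetcher.max.exceptions.per.queue</name>
  <value>5</value>
</property>
```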
