Fetcher to skip queues for URLS getting repeated exceptions
-------------------------------------------------------------
Key: NUTCH-769
URL: https://issues.apache.org/jira/browse/NUTCH-769
Project: Nutch
Issue Type: Improvement
Components: fetcher
Reporter: Julien Nioche
Priority: Minor

As discussed on the mailing list (see http://www.mail-archive.com/nutch-u...@lucene.apache.org/msg15360.html), this patch allows URL queues in the Fetcher to be cleared when more than a set number of exceptions have been encountered in a row. This can speed up fetching substantially when target hosts are unresponsive (each request would otherwise end in a TimeoutException), and it limits cases where a whole fetch step is slowed down by a few queues.

By default, the parameter fetcher.max.exceptions.per.queue has a value of -1, which disables the feature.
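
The behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the code from the attached patch: the class QueueExceptionTracker and its methods are hypothetical names, the queue id is assumed to be a host name, and "in a row" is interpreted as a counter that resets after any successful fetch. Only the -1 "disabled" convention comes from the issue text.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch; not the class used by the Nutch Fetcher. */
class QueueExceptionTracker {
    // Mirrors the meaning of fetcher.max.exceptions.per.queue: -1 disables the check.
    private final int maxExceptionsPerQueue;
    // Pending URLs per queue id (assumed here to be one queue per host).
    private final Map<String, Deque<String>> queues = new HashMap<>();
    // Number of exceptions seen in a row for each queue.
    private final Map<String, Integer> consecutiveExceptions = new HashMap<>();

    QueueExceptionTracker(int maxExceptionsPerQueue) {
        this.maxExceptionsPerQueue = maxExceptionsPerQueue;
    }

    void addUrl(String queueId, String url) {
        queues.computeIfAbsent(queueId, k -> new ArrayDeque<>()).add(url);
    }

    /** A successful fetch breaks the run of exceptions (an assumption of this sketch). */
    void recordSuccess(String queueId) {
        consecutiveExceptions.remove(queueId);
    }

    /**
     * Records one exception for the queue. If more than the configured number
     * have now occurred in a row, the queue is cleared.
     *
     * @return the number of URLs dropped, or 0 if the queue was left alone
     */
    int recordException(String queueId) {
        int count = consecutiveExceptions.merge(queueId, 1, Integer::sum);
        if (maxExceptionsPerQueue == -1 || count <= maxExceptionsPerQueue) {
            return 0;
        }
        Deque<String> queue = queues.get(queueId);
        if (queue == null) {
            return 0;
        }
        int dropped = queue.size();
        queue.clear();
        consecutiveExceptions.remove(queueId);
        return dropped;
    }

    int pending(String queueId) {
        Deque<String> queue = queues.get(queueId);
        return queue == null ? 0 : queue.size();
    }
}
```

With a threshold of 2, the third consecutive exception on a queue would empty it, so the remaining URLs for that unresponsive host no longer hold up the fetch step. With the default of -1 the tracker never clears anything.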