[ https://issues.apache.org/jira/browse/NUTCH-769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Julien Nioche updated NUTCH-769:
--------------------------------

    Attachment: NUTCH-769-2.patch

> Fetcher to skip queues for URLS getting repeated exceptions
> -------------------------------------------------------------
>
>                 Key: NUTCH-769
>                 URL: https://issues.apache.org/jira/browse/NUTCH-769
>             Project: Nutch
>          Issue Type: Improvement
>          Components: fetcher
>            Reporter: Julien Nioche
>            Priority: Minor
>         Attachments: NUTCH-769-2.patch, NUTCH-769.patch
>
>
> As discussed on the mailing list (see
> http://www.mail-archive.com/nutch-u...@lucene.apache.org/msg15360.html), this
> patch lets the Fetcher clear a URL queue once more than a set number of
> exceptions have been encountered in a row. This can speed up fetching
> substantially when target hosts are unresponsive (where a TimeoutException
> would be thrown), and it limits cases where a whole fetch step is slowed down
> by a few queues.
> By default the parameter fetcher.max.exceptions.per.queue has a value of -1,
> which deactivates the feature.
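The behaviour described above can be sketched roughly as follows. This is a minimal, hypothetical illustration and not the code in NUTCH-769-2.patch: the class QueueExceptionSketch and its methods are invented for the example, and three choices are assumptions read from the description rather than from the patch: a successful fetch resets the counter ("in a row"), the queue is cleared only when the count strictly exceeds the limit ("more than"), and any negative limit disables the check.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Hypothetical illustration only; this is not the code from NUTCH-769-2.patch.
class QueueExceptionSketch {
    // Stands in for fetcher.max.exceptions.per.queue; a negative value disables the check.
    private final int maxExceptionsPerQueue;
    // Pending URLs per queue id (for example, one queue per host).
    private final Map<String, Queue<String>> queues = new HashMap<>();
    // Number of exceptions seen in a row for each queue id.
    private final Map<String, Integer> exceptionsInARow = new HashMap<>();

    QueueExceptionSketch(int maxExceptionsPerQueue) {
        this.maxExceptionsPerQueue = maxExceptionsPerQueue;
    }

    void add(String queueId, String url) {
        queues.computeIfAbsent(queueId, id -> new ArrayDeque<>()).add(url);
    }

    int pending(String queueId) {
        Queue<String> queue = queues.get(queueId);
        return queue == null ? 0 : queue.size();
    }

    // Assumption: a successful fetch breaks the run of exceptions.
    void recordSuccess(String queueId) {
        exceptionsInARow.remove(queueId);
    }

    // Counts one more exception and returns how many queued URLs were dropped.
    int recordException(String queueId) {
        int count = exceptionsInARow.merge(queueId, 1, Integer::sum);
        // Assumption: strict "more than", following the wording of the description.
        if (maxExceptionsPerQueue < 0 || count <= maxExceptionsPerQueue) {
            return 0;
        }
        int dropped = pending(queueId);
        queues.remove(queueId);
        exceptionsInARow.remove(queueId);
        return dropped;
    }

    public static void main(String[] args) {
        QueueExceptionSketch fetcher = new QueueExceptionSketch(2);
        for (int i = 0; i < 5; i++) {
            fetcher.add("slow.example.org", "http://slow.example.org/page" + i);
        }
        for (int attempt = 1; attempt <= 3; attempt++) {
            int dropped = fetcher.recordException("slow.example.org");
            System.out.println("exception " + attempt + ": dropped " + dropped + " URLs");
        }
    }
}
```

With a limit of 2, the first two exceptions leave the queue alone and the third one empties it, so the remaining URLs for that host no longer hold up the rest of the fetch. With the default of -1 the counter is still updated but nothing is ever dropped.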