[ https://issues.apache.org/jira/browse/NUTCH-769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12784260#action_12784260 ]
Andrzej Bialecki commented on NUTCH-769:
-----------------------------------------

I had to apply this patch by hand, due to NUTCH-770. I also added documentation to conf/nutch-default.xml. This was committed in rev. 885785 - thanks!

> Fetcher to skip queues for URLS getting repeated exceptions
> -------------------------------------------------------------
>
>                 Key: NUTCH-769
>                 URL: https://issues.apache.org/jira/browse/NUTCH-769
>             Project: Nutch
>          Issue Type: Improvement
>          Components: fetcher
>            Reporter: Julien Nioche
>            Assignee: Andrzej Bialecki
>            Priority: Minor
>             Fix For: 1.1
>
>         Attachments: NUTCH-769-2.patch, NUTCH-769.patch
>
>
> As discussed on the mailing list (see
> http://www.mail-archive.com/nutch-u...@lucene.apache.org/msg15360.html), this
> patch allows the Fetcher to clear a URL queue when more than a set number
> of exceptions have been encountered in a row. This can speed up fetching
> substantially when target hosts are not responsive (as a
> TimeoutException would be thrown) and limits cases where a whole fetch step
> is slowed down because of a few queues.
> By default, the parameter fetcher.max.exceptions.per.queue has a value of -1,
> which deactivates the check.
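For readers who have not looked at the patch, the behaviour described in the issue can be sketched roughly as follows. This is only an illustration and not the code committed in rev. 885785: the class name QueueExceptionSketch and its methods are hypothetical, the reset of the counter on a successful fetch is an assumption drawn from the words "in a row", and the strict "more than" comparison follows the wording of the description rather than the patch itself. Only the parameter name fetcher.max.exceptions.per.queue and its default of -1 come from the issue.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of per-queue exception tracking; not the NUTCH-769 patch.
class QueueExceptionSketch {
    // Mirrors fetcher.max.exceptions.per.queue; -1 disables the check.
    private final int maxExceptionsPerQueue;
    // Pending URLs, keyed by queue id (for example, a host name).
    private final Map<String, Deque<String>> queues = new HashMap<>();
    // Exceptions seen in a row for each queue (assumed to reset on success).
    private final Map<String, Integer> consecutiveExceptions = new HashMap<>();

    QueueExceptionSketch(int maxExceptionsPerQueue) {
        this.maxExceptionsPerQueue = maxExceptionsPerQueue;
    }

    void addUrl(String queueId, String url) {
        queues.computeIfAbsent(queueId, k -> new ArrayDeque<>()).add(url);
    }

    int pending(String queueId) {
        Deque<String> queue = queues.get(queueId);
        return queue == null ? 0 : queue.size();
    }

    // A successful fetch breaks the run of consecutive exceptions.
    void recordSuccess(String queueId) {
        consecutiveExceptions.remove(queueId);
    }

    // Records one exception and returns how many pending URLs were dropped.
    int recordException(String queueId) {
        int count = consecutiveExceptions.merge(queueId, 1, Integer::sum);
        if (maxExceptionsPerQueue == -1 || count <= maxExceptionsPerQueue) {
            return 0; // check disabled, or threshold not yet exceeded
        }
        Deque<String> queue = queues.get(queueId);
        if (queue == null) {
            return 0;
        }
        int dropped = queue.size();
        queue.clear(); // skip everything still waiting in this queue
        return dropped;
    }
}
```

With a limit of 2, the third consecutive exception on a queue clears whatever is still pending for it, while a tracker built with -1 never drops anything.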