[ https://issues.apache.org/jira/browse/NUTCH-876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andrzej Bialecki updated NUTCH-876:
------------------------------------
Attachment: NUTCH-876.patch
Patch to fix the issue. If there are no objections, I'll commit this shortly.
> Remove remaining robots/IP blocking code in lib-http
> ----------------------------------------------------
>
> Key: NUTCH-876
> URL: https://issues.apache.org/jira/browse/NUTCH-876
> Project: Nutch
> Issue Type: Bug
> Components: fetcher
> Affects Versions: 2.0
> Reporter: Andrzej Bialecki
> Assignee: Andrzej Bialecki
> Attachments: NUTCH-876.patch
>
>
> Remnants of the (very old) blocking code are still present in
> lib-http/.../HttpBase.java. This code was used with the OldFetcher to manage
> politeness limits. The current trunk no longer has OldFetcher, so this code is
> useless. Furthermore, there is an actual bug here: FetcherJob forgets to set
> Protocol.CHECK_BLOCKING and Protocol.CHECK_ROBOTS to false, and the defaults
> in lib-http are true, so both checks stay enabled.
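
To illustrate the bug described above, here is a minimal, self-contained sketch of how an unset boolean flag falls back to a default of true. It is not the contents of NUTCH-876.patch, and it does not use the real Nutch or Hadoop classes: a plain Map stands in for the job configuration, the class and helper names are hypothetical, and the two key strings are assumed values for Protocol.CHECK_BLOCKING and Protocol.CHECK_ROBOTS.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in classes; nothing here is taken from the actual patch.
class BlockingFlagsSketch {
    // Assumed key names standing in for Protocol.CHECK_BLOCKING and
    // Protocol.CHECK_ROBOTS; verify them against the Nutch sources.
    static final String CHECK_BLOCKING = "protocol.plugin.check.blocking";
    static final String CHECK_ROBOTS = "protocol.plugin.check.robots";

    // Mimics a getBoolean(key, default) lookup: an unset key yields the default.
    static boolean getBoolean(Map<String, String> conf, String key, boolean defaultValue) {
        String value = conf.get(key);
        return value == null ? defaultValue : Boolean.parseBoolean(value);
    }

    // What a fetcher job would have to do explicitly to switch both checks off
    // while the protocol-level code still exists.
    static void disableProtocolLevelChecks(Map<String, String> conf) {
        conf.put(CHECK_BLOCKING, "false");
        conf.put(CHECK_ROBOTS, "false");
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();

        // The job never sets the flags, so the default of true applies.
        System.out.println("blocking check: " + getBoolean(conf, CHECK_BLOCKING, true));
        System.out.println("robots check:   " + getBoolean(conf, CHECK_ROBOTS, true));

        // After setting them explicitly, both checks are off.
        disableProtocolLevelChecks(conf);
        System.out.println("blocking check: " + getBoolean(conf, CHECK_BLOCKING, true));
        System.out.println("robots check:   " + getBoolean(conf, CHECK_ROBOTS, true));
    }
}
```

Setting the flags in the job would only work around the problem. The issue title proposes removing the leftover code from lib-http instead, so that there is no default left to get wrong.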