[ https://issues.apache.org/jira/browse/NUTCH-2803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17453087#comment-17453087 ]
Hudson commented on NUTCH-2803:
-------------------------------
SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #55 (See
[https://ci-builds.apache.org/job/Nutch/job/Nutch-trunk/55/])
NUTCH-2803 Rename property http.robot.rules.whitelist (lewismc:
[https://github.com/apache/nutch/commit/8971ccc3ec96d80f22373782145e23dc14fba8b9])
* (edit) conf/nutch-default.xml
* (edit) src/java/org/apache/nutch/protocol/RobotRulesParser.java
* (edit) src/plugin/protocol-ftp/src/java/org/apache/nutch/protocol/ftp/FtpRobotRulesParser.java
* (edit) src/plugin/lib-http/src/java/org/apache/nutch/protocol/http/api/HttpRobotRulesParser.java
> Rename property http.robot.rules.whitelist
> ------------------------------------------
>
> Key: NUTCH-2803
> URL: https://issues.apache.org/jira/browse/NUTCH-2803
> Project: Nutch
> Issue Type: Sub-task
> Components: configuration, robots
> Reporter: Sebastian Nagel
> Assignee: Lewis John McGibbney
> Priority: Major
> Fix For: 1.19
>
>
> As part of NUTCH-2802 the property {{http.robot.rules.whitelist}} should be
> renamed.
> See the [definition of
> http.robot.rules.whitelist|http://nutch.apache.org/apidocs/apidocs-1.17/resources/nutch-default.xml#http.robot.rules.whitelist]:
> bq. Comma separated list of hostnames or IP addresses to ignore robot rules
> parsing for. Use with care and only if you are explicitly allowed by the site
> owner to ignore the site's robots.txt!
--
This message was sent by Atlassian Jira
(v8.20.1#820001)