[ 
https://issues.apache.org/jira/browse/NUTCH-2803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17453087#comment-17453087
 ] 

Hudson commented on NUTCH-2803:
-------------------------------

SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #55 (See 
[https://ci-builds.apache.org/job/Nutch/job/Nutch-trunk/55/])
NUTCH-2803 Rename property http.robot.rules.whitelist (lewismc: 
[https://github.com/apache/nutch/commit/8971ccc3ec96d80f22373782145e23dc14fba8b9])
* (edit) conf/nutch-default.xml
* (edit) src/java/org/apache/nutch/protocol/RobotRulesParser.java
* (edit) src/plugin/protocol-ftp/src/java/org/apache/nutch/protocol/ftp/FtpRobotRulesParser.java
* (edit) src/plugin/lib-http/src/java/org/apache/nutch/protocol/http/api/HttpRobotRulesParser.java


> Rename property http.robot.rules.whitelist
> ------------------------------------------
>
>                 Key: NUTCH-2803
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2803
>             Project: Nutch
>          Issue Type: Sub-task
>          Components: configuration, robots
>            Reporter: Sebastian Nagel
>            Assignee: Lewis John McGibbney
>            Priority: Major
>             Fix For: 1.19
>
>
> As part of NUTCH-2802 the property {{http.robot.rules.whitelist}} should be 
> renamed.
> See the [definition of 
> http.robot.rules.whitelist|http://nutch.apache.org/apidocs/apidocs-1.17/resources/nutch-default.xml#http.robot.rules.whitelist]:
> bq. Comma separated list of hostnames or IP addresses to ignore robot rules 
> parsing for. Use with care and only if you are explicitly allowed by the site 
> owner to ignore the site's robots.txt!



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
