[jira] [Commented] (NUTCH-2002) ParserChecker and IndexingFiltersChecker to check robots.txt
[ https://issues.apache.org/jira/browse/NUTCH-2002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17178608#comment-17178608 ]

Hudson commented on NUTCH-2002:
---

SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #3 (See [https://ci-builds.apache.org/job/Nutch/job/Nutch-trunk/3/])
NUTCH-2002 parse and index checkers to check robots.txt (snagel: [https://github.com/apache/nutch/commit/aed6fa71fa7cd07740235e4c4aeca8380ddb9b48])
* (edit) src/java/org/apache/nutch/indexer/IndexingFiltersChecker.java
* (edit) src/java/org/apache/nutch/util/AbstractChecker.java
* (edit) src/java/org/apache/nutch/parse/ParserChecker.java

> ParserChecker and IndexingFiltersChecker to check robots.txt
>
>
>                 Key: NUTCH-2002
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2002
>             Project: Nutch
>          Issue Type: Improvement
>          Components: parser
>    Affects Versions: 1.9
>            Reporter: Julien Nioche
>            Assignee: Sebastian Nagel
>            Priority: Minor
>             Fix For: 1.17
>
>         Attachments: NUTCH-2002.patch
>
>
> ParserChecker could check whether a given URL is allowed by the robots.txt
> directives.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Commented] (NUTCH-2002) ParserChecker and IndexingFiltersChecker to check robots.txt
[ https://issues.apache.org/jira/browse/NUTCH-2002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17099977#comment-17099977 ]

Hudson commented on NUTCH-2002:
---

SUCCESS: Integrated in Jenkins build Nutch-trunk #3681 (See [https://builds.apache.org/job/Nutch-trunk/3681/])
NUTCH-2002 parse and index checkers to check robots.txt - applied (snagel: [https://github.com/apache/nutch/commit/46db3ed71355fefda42a008ece75094f51859ab2])
* (edit) src/java/org/apache/nutch/util/AbstractChecker.java
* (edit) src/java/org/apache/nutch/indexer/IndexingFiltersChecker.java
* (edit) src/java/org/apache/nutch/parse/ParserChecker.java

> ParserChecker and IndexingFiltersChecker to check robots.txt
>
>
>                 Key: NUTCH-2002
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2002
>             Project: Nutch
>          Issue Type: Improvement
>          Components: parser
>    Affects Versions: 1.9
>            Reporter: Julien Nioche
>            Assignee: Sebastian Nagel
>            Priority: Minor
>             Fix For: 1.17
>
>         Attachments: NUTCH-2002.patch
>
>
> ParserChecker could check whether a given URL is allowed by the robots.txt
> directives.