Delegate parsing of robots.txt to crawler-commons
-------------------------------------------------
Key: NUTCH-1031
URL: https://issues.apache.org/jira/browse/NUTCH-1031
Project: Nutch
Issue Type: Task
Reporter: Julien Nioche
Assignee: Julien Nioche
Priority: Minor
Fix For: 1.4, 2.0
We're about to release the first version of Crawler-Commons
[http://code.google.com/p/crawler-commons/], which contains a parser for
robots.txt files. This parser should be better than the one we currently
have in Nutch. I will delegate this functionality to Crawler-Commons (CC)
as soon as it is publicly available.