Sebastian Nagel created NUTCH-3136:
--------------------------------------

             Summary: Upgrade crawler-commons dependency
                 Key: NUTCH-3136
                 URL: https://issues.apache.org/jira/browse/NUTCH-3136
             Project: Nutch
          Issue Type: Improvement
          Components: dependency, robots, sitemap, util
    Affects Versions: 1.22
            Reporter: Sebastian Nagel
            Assignee: Sebastian Nagel
             Fix For: 1.22


Crawler-commons 
[1.6|https://github.com/crawler-commons/crawler-commons/releases/tag/crawler-commons-1.6]
 has been released recently. The last upgrade of the crawler-commons dependency 
was to 1.4 in NUTCH-2995.

The upgrade should also include the switch to the methods of the Robots parser 
that take URL objects as parameters, thus avoiding repeated parsing of URLs.
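
A minimal sketch of the idea, assuming a hypothetical rule holder: the class 
ToyRules, its parseCount field and the parse helper are made up for 
illustration and are not the crawler-commons or Nutch API. It only shows why 
passing an already parsed java.net.URL saves work compared to passing a 
string that has to be parsed again on every check.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;
import java.util.List;

public class RobotsUrlSketch {

  /** Hypothetical stand-in for a robots rule set (not the crawler-commons API). */
  static class ToyRules {
    private final List<String> disallowedPrefixes;
    /** Counts how often a URL string had to be parsed inside the rules. */
    int parseCount = 0;

    ToyRules(List<String> disallowedPrefixes) {
      this.disallowedPrefixes = disallowedPrefixes;
    }

    /** String variant: the URL is parsed on every call. */
    boolean isAllowed(String url) {
      parseCount++;
      return isAllowed(parse(url));
    }

    /** URL variant: reuses the object the caller has already parsed. */
    boolean isAllowed(URL url) {
      String path = url.getPath().isEmpty() ? "/" : url.getPath();
      for (String prefix : disallowedPrefixes) {
        if (path.startsWith(prefix)) {
          return false;
        }
      }
      return true;
    }
  }

  /** Parses a URL string, wrapping the checked exception for brevity. */
  static URL parse(String s) {
    try {
      return URI.create(s).toURL();
    } catch (MalformedURLException e) {
      throw new IllegalArgumentException("Not a valid URL: " + s, e);
    }
  }

  public static void main(String[] args) {
    ToyRules rules = new ToyRules(List.of("/private/"));
    // Parsed once by the caller, then passed on as an object.
    URL url = parse("https://example.org/private/page.html");
    System.out.println("allowed: " + rules.isAllowed(url));
    System.out.println("parses inside rules: " + rules.parseCount);
  }
}
```

In a fetcher that already holds a URL object for each item, calling the URL 
variant keeps the parse count inside the rule check at zero, which is the 
saving the switch is meant to achieve.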



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
