[ https://issues.apache.org/jira/browse/NUTCH-1678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
James Sullivan updated NUTCH-1678:
----------------------------------
    Attachment: 2.x.patch

parse/OutlinkExtractor
index-more
parse-js
urlnormalizer-basic

Needs to be looked over and tested first.

> Remove dependency on org.apache.oro
> -----------------------------------
>
>                 Key: NUTCH-1678
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1678
>             Project: Nutch
>          Issue Type: Improvement
>          Components: parser
>    Affects Versions: 2.2
>            Reporter: James Sullivan
>            Priority: Minor
>              Labels: newbie, patch
>         Attachments: 2.x.patch
>
>
> org.apache.oro has been archived for three years, and it may be good to remove
> the dependency, as Java has had built-in regexes for quite some time now.
> There doesn't seem to be any Perl5-specific functionality needed in the
> regexes, so unless there are specific threading or performance reasons for
> continuing to use oro, it may be time to lose the dependency. The attached patch
> needs to be checked thoroughly, as I am rusty with Java and the unit tests are
> sparse.
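For anyone reviewing, here is a minimal sketch of the kind of find-all-matches loop that java.util.regex can provide in place of oro's Perl5Compiler/Perl5Matcher pair. It is not taken from the attached patch: the class name OutlinkSketch, the method findUrls, and the URL pattern are all hypothetical and much simpler than anything in Nutch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not the code in 2.x.patch.
class OutlinkSketch {

    // Assumed, deliberately simple URL pattern (not the one Nutch uses).
    // A compiled Pattern is immutable and can be shared between threads.
    private static final Pattern URL_PATTERN =
        Pattern.compile("https?://[^\\s<>\"']+", Pattern.CASE_INSENSITIVE);

    // Returns every substring of text that matches URL_PATTERN, in order.
    static List<String> findUrls(String text) {
        List<String> urls = new ArrayList<>();
        if (text == null) {
            return urls;
        }
        // A Matcher is not thread-safe, so create one per call.
        Matcher matcher = URL_PATTERN.matcher(text);
        while (matcher.find()) {
            urls.add(matcher.group());
        }
        return urls;
    }

    public static void main(String[] args) {
        System.out.println(findUrls(
            "See http://nutch.apache.org/ and https://issues.apache.org/jira today"));
    }
}
```

On the threading question raised in the description: keeping the compiled Pattern in a static field and creating a fresh Matcher per call avoids sharing mutable state between threads. Whether performance matches oro would still need to be measured.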