[
https://issues.apache.org/jira/browse/NUTCH-1014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13067989#comment-13067989
]
Markus Jelsma commented on NUTCH-1014:
--------------------------------------
Good catch! Performance does seem good, and it offers Unicode support and
negative look-behind. I'll check out the API.
> Migrate from Apache ORO to java.util.regex
> ------------------------------------------
>
> Key: NUTCH-1014
> URL: https://issues.apache.org/jira/browse/NUTCH-1014
> Project: Nutch
> Issue Type: Improvement
> Reporter: Markus Jelsma
> Fix For: 1.4, 2.0
>
>
> A separate issue tracking migration of all components from Apache ORO to
> java.util.regex. Components involved are:
> - RegexURLNormalizer
> - OutlinkExtractor
> - JSParseFilter
> - MoreIndexingFilter
> - BasicURLNormalizer
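
For reference, a minimal sketch of what a rule might look like on the
java.util.regex side, showing negative look-behind and a Unicode-aware
character class. The class name, method names and both patterns are
hypothetical and are not taken from any of the Nutch components listed
above; this only illustrates the target API, not the actual migration.

```java
import java.util.regex.Pattern;

public class RegexMigrationSketch {

    // Hypothetical rule (not from Nutch): collapse runs of slashes,
    // except the pair right after the scheme's colon ("http://").
    // (?<!:) is a negative look-behind.
    private static final Pattern DUPLICATE_SLASHES =
        Pattern.compile("(?<!:)/{2,}");

    // Unicode-aware: \p{L} matches a letter in any script.
    private static final Pattern UNICODE_WORD =
        Pattern.compile("\\p{L}+");

    static String collapseSlashes(String url) {
        // Pattern is compiled once; a new Matcher is created per call.
        return DUPLICATE_SLASHES.matcher(url).replaceAll("/");
    }

    static boolean isUnicodeWord(String s) {
        // matches() requires the whole input to match the pattern.
        return UNICODE_WORD.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(collapseSlashes("http://example.com//a///b"));
        System.out.println(isUnicodeWord("Z\u00fcrich"));
    }
}
```

Compiled Pattern instances are immutable and can be shared between
threads, while Matcher instances cannot, which is why the sketch keeps
the patterns in static fields and creates a matcher on each call.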
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira