[ 
https://issues.apache.org/jira/browse/NUTCH-2522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16387496#comment-16387496
 ] 

ASF GitHub Bot commented on NUTCH-2522:
---------------------------------------

okedoki opened a new pull request #290: NUTCH-2522
URL: https://github.com/apache/nutch/pull/290
 
 
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


>  Bidirectional URL exemption filter
> -----------------------------------
>
>                 Key: NUTCH-2522
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2522
>             Project: Nutch
>          Issue Type: Improvement
>          Components: plugin
>            Reporter: Semyon Semyonov
>            Priority: Minor
>
> The current Nutch URL exemption plugin decides exemptions based on toUrl only. 
> The new plugin uses both fromUrl and toUrl: it applies a regex transformation 
> to each and exempts the link when regex(fromUrl) == regex(toUrl).
> This allows more complex URL exemption checks, such as allowing links between 
> the www and non-www forms of the same host:
> http://www.website.com/home -> http://website.com/about
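
The condition regex(fromUrl) == regex(toUrl) from the description can be sketched roughly as below. This is only an illustration and not the code from the pull request: the class name, the method names, and the host-extracting rule are all assumptions made for the example, and the real plugin presumably reads its regex from configuration.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch, not the actual Nutch plugin class.
class BidirectionalExemptionSketch {

    // Assumed rule: drop the scheme and an optional "www." prefix,
    // then capture the host. Everything after the host is ignored.
    private static final Pattern HOST_RULE = Pattern.compile(
        "^https?://(?:www\\.)?([^/:?#]+).*$", Pattern.CASE_INSENSITIVE);

    // Applies the regex transformation; returns null if the URL does not match.
    static String transform(String url) {
        Matcher m = HOST_RULE.matcher(url);
        return m.matches() ? m.group(1).toLowerCase(Locale.ROOT) : null;
    }

    // Exempts the link only when both URLs transform to the same value.
    static boolean isExempted(String fromUrl, String toUrl) {
        String from = transform(fromUrl);
        String to = transform(toUrl);
        return from != null && from.equals(to);
    }

    public static void main(String[] args) {
        // www and non-www forms of the same host: exempted
        System.out.println(isExempted(
            "http://www.website.com/home", "http://website.com/about"));
        // different hosts: not exempted
        System.out.println(isExempted(
            "http://www.website.com/home", "http://other.org/about"));
    }
}
```

With this assumed rule, a link is exempted only when both ends reduce to the same host, which a check on toUrl alone cannot express.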



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
