[ 
https://issues.apache.org/jira/browse/NUTCH-2196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Markus Jelsma updated NUTCH-2196:
---------------------------------
    Attachment: NUTCH-2196.patch

Patch for trunk introducing the -normalize flag. If enabled, input URLs are 
passed through the configured normalizers (default scope), so it is possible 
to input unencoded URLs and the like.

Removed URLUtil, so it is now also possible to input both encoded and 
unencoded URLs at the same time!
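
To illustrate the behaviour described above, here is a minimal, self-contained sketch. It is not the code from the patch: the class NormalizeFlagSketch and the methods encodeUnsafe and prepare are hypothetical names, and the simple percent-encoding step only stands in for whatever normalizers are actually configured in Nutch. It shows a flag-gated step that accepts encoded and unencoded input alike.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for illustration only; not part of Nutch or the patch.
class NormalizeFlagSketch {

    // Percent-encode bytes that may not appear literally in a URL.
    // '%' itself is left alone, so existing escapes such as %20
    // are passed through and never double-encoded.
    static String encodeUnsafe(String url) {
        StringBuilder out = new StringBuilder();
        for (byte b : url.getBytes(StandardCharsets.UTF_8)) {
            int c = b & 0xff;
            boolean unsafe = c <= 0x20 || c >= 0x7f
                    || "\"<>\\^`{|}".indexOf(c) >= 0;
            if (unsafe) {
                out.append('%')
                   .append(Character.toUpperCase(Character.forDigit(c >> 4, 16)))
                   .append(Character.toUpperCase(Character.forDigit(c & 0xf, 16)));
            } else {
                out.append((char) c);
            }
        }
        return out.toString();
    }

    // Mirrors the idea of the -normalize flag: only touch the input
    // when normalization was requested.
    static String prepare(String input, boolean normalize) {
        return normalize ? encodeUnsafe(input.trim()) : input;
    }

    public static void main(String[] args) {
        // Unencoded input is encoded.
        System.out.println(prepare("http://example.com/a b", true));
        // Already-encoded input is left unchanged.
        System.out.println(prepare("http://example.com/a%20b", true));
        // Without the flag, the input is used as given.
        System.out.println(prepare("http://example.com/a b", false));
    }
}
```

With the flag on, both of the first two inputs end up as the same URL, which is the point of accepting encoded and unencoded forms side by side.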

Will commit shortly.

> IndexingFilterChecker to optionally normalize
> ---------------------------------------------
>
>                 Key: NUTCH-2196
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2196
>             Project: Nutch
>          Issue Type: Improvement
>            Reporter: Markus Jelsma
>            Assignee: Markus Jelsma
>            Priority: Trivial
>             Fix For: 1.12
>
>         Attachments: NUTCH-2196.patch
>
>
> As mentioned in NUTCH-2194, we sometimes use it as a backend for a web 
> application. In that case end users will inevitably input bad URLs, so 
> having a normalizer running would improve the user experience.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
