Outlinks are not properly normalized
------------------------------------

                 Key: NUTCH-1174
                 URL: https://issues.apache.org/jira/browse/NUTCH-1174
             Project: Nutch
          Issue Type: Bug
          Components: parser
    Affects Versions: 1.3
            Reporter: Markus Jelsma
            Assignee: Markus Jelsma
             Fix For: 1.5


In ParseOutputFormat, the toUrl string is read from each Outlink and is then 
filtered and normalized. However, the original Outlink object is what actually 
gets added to the output. The normalized URL held in toUrl is never written 
back to the Outlink object, so the outlink keeps its unnormalized URL.

This issue adds a setUrl method to Outlink, which ParseOutputFormat uses to 
overwrite the unnormalized URL with the normalized one.
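
The behaviour described above can be illustrated with a minimal sketch. This is 
not the Nutch source: the Outlink class here is a simplified, hypothetical 
stand-in, normalize() is a toy that only strips the URL fragment, and the 
collect() helper and its writeBack flag are assumptions made purely to contrast 
the old and new behaviour. Only the setUrl name comes from this issue.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for Nutch's Outlink (not the real class).
class Outlink {
    private String toUrl;
    private final String anchor;

    Outlink(String toUrl, String anchor) {
        this.toUrl = toUrl;
        this.anchor = anchor;
    }

    String getToUrl() { return toUrl; }
    String getAnchor() { return anchor; }

    // The setter proposed in this issue: lets the caller store the
    // normalized URL back into the Outlink.
    void setUrl(String toUrl) { this.toUrl = toUrl; }
}

class OutlinkNormalizationSketch {

    // Toy normalizer (assumption): drops the fragment part of the URL.
    // The real URL normalizers and filters in Nutch do considerably more.
    static String normalize(String url) {
        int hash = url.indexOf('#');
        return hash < 0 ? url : url.substring(0, hash);
    }

    // Mimics the loop described above. With writeBack == false the
    // normalized string is computed but the untouched Outlink is kept
    // (the reported bug). With writeBack == true the normalized URL is
    // stored via setUrl before the Outlink is kept (the proposed fix).
    static List<Outlink> collect(Outlink[] links, boolean writeBack) {
        List<Outlink> kept = new ArrayList<>();
        for (Outlink link : links) {
            String toUrl = normalize(link.getToUrl());
            if (toUrl.isEmpty()) {
                continue; // treated here as "filtered out"
            }
            if (writeBack) {
                link.setUrl(toUrl);
            }
            kept.add(link);
        }
        return kept;
    }
}
```

Calling collect() with writeBack set to false leaves the fragment on the stored 
URL, while writeBack set to true stores the normalized form. Mutating the 
existing Outlink keeps the anchor text intact without building a new object.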

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
