[ http://issues.apache.org/jira/browse/NUTCH-359?page=comments#action_12433315 ]

Otis Gospodnetic commented on NUTCH-359:
----------------------------------------
Looks fine and simple (and has a small typo in the last comment).
Sami is doing 0.8.1 soon, so I won't mess with this now.

> extraction of links will fail for whole page if one single link cannot be parsed
> --------------------------------------------------------------------------------
>
> Key: NUTCH-359
> URL: http://issues.apache.org/jira/browse/NUTCH-359
> Project: Nutch
> Issue Type: Bug
> Components: fetcher
> Affects Versions: 0.8
> Environment: Ubuntu Dapper
> Reporter: Renaud Richardet
> Priority: Minor
> Attachments: outlink.diff
>
>
> When Nutch parses the outlinks of a fetched page, the process will fail if a
> single link cannot be parsed (e.g. java.net.MalformedURLException: unknown
> protocol). The attached patch will keep indexing the remaining links on that
> page even if one fails.

--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
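
The behaviour described in the issue (skip a link that cannot be parsed and keep going with the rest of the page) can be sketched roughly as follows. This is only an illustration and not the contents of outlink.diff: the class OutlinkSketch and the method parseOutlinks are hypothetical names, and the sketch assumes plain java.net.URL parsing rather than Nutch's own outlink handling.

```java
import java.net.MalformedURLException;
import java.net.URL;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not Nutch code: illustrates per-link error handling.
class OutlinkSketch {

    // Returns the links that parse as URLs. A link that fails to parse is
    // logged and skipped instead of aborting the whole page.
    static List<URL> parseOutlinks(List<String> rawLinks) {
        List<URL> outlinks = new ArrayList<>();
        for (String raw : rawLinks) {
            try {
                outlinks.add(new URL(raw));
            } catch (MalformedURLException e) {
                // The try/catch sits inside the loop, so one bad link
                // (e.g. an unknown protocol) does not lose the others.
                System.err.println("Skipping unparsable link: " + raw
                        + " (" + e.getMessage() + ")");
            }
        }
        return outlinks;
    }

    public static void main(String[] args) {
        // "foo" is an assumed example of a protocol java.net.URL does not know.
        List<String> raw = Arrays.asList(
                "http://www.apache.org/",
                "foo://example.org/page",
                "http://lucene.apache.org/nutch/");
        List<URL> kept = parseOutlinks(raw);
        System.out.println(kept.size() + " of " + raw.size() + " links kept");
    }
}
```

The point of the sketch is only the placement of the try/catch: with the exception handled per link, a MalformedURLException affects that one link rather than every outlink on the page.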
