[ https://issues.apache.org/jira/browse/NUTCH-685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Lewis John McGibbney updated NUTCH-685:
---------------------------------------

    Fix Version/s: 2.2
                   1.7

> Content-level redirect status lost in ParseSegment
> --------------------------------------------------
>
>                 Key: NUTCH-685
>                 URL: https://issues.apache.org/jira/browse/NUTCH-685
>             Project: Nutch
>          Issue Type: Bug
>    Affects Versions: 1.0.0
>            Reporter: Andrzej Bialecki
>            Assignee: Andrzej Bialecki
>             Fix For: 1.7, 2.2
>
>
> When Fetcher runs in parsing mode, content-level redirects (HTML meta tag
> "Refresh") are properly discovered and recorded in crawl_fetch under source
> URL and target URL. If Fetcher runs in non-parsing mode, and ParseSegment is
> run as a separate step, the content-level redirection data is used only to
> add the new (target) URL, but the status of the original URL is not reset to
> indicate a redirect. Consequently, status of the original URL will be
> different depending on the way you run Fetcher, whereas it should be the same.
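
The discrepancy described in the issue can be illustrated with a small, self-contained sketch. It does not use Nutch classes: the Status enum, the method names, and the example URLs are all hypothetical, and the map only stands in for crawl_fetch. The exact statuses Nutch records are assumptions here. The point is only that the source URL ends up with a different status depending on which path is taken.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class RedirectStatusSketch {

    // Hypothetical, simplified statuses; these are NOT Nutch's CrawlDatum constants.
    public enum Status { FETCHED, REDIRECTED, LINKED }

    // Parsing-mode Fetcher, as the issue describes it: the content-level
    // redirect is recorded under both the source and the target URL.
    public static Map<String, Status> fetchWithParsing(String source, String target) {
        Map<String, Status> crawlFetch = new LinkedHashMap<>();
        crawlFetch.put(source, Status.REDIRECTED); // source marked as a redirect
        crawlFetch.put(target, Status.LINKED);     // target added as a new URL
        return crawlFetch;
    }

    // Non-parsing Fetcher followed by a separate parse step, as reported:
    // the target URL is added, but the source status is never reset.
    public static Map<String, Status> fetchThenParse(String source, String target) {
        Map<String, Status> crawlFetch = new LinkedHashMap<>();
        crawlFetch.put(source, Status.FETCHED);    // written by the fetch step
        crawlFetch.put(target, Status.LINKED);     // added later by the parse step
        return crawlFetch;
    }

    public static void main(String[] args) {
        String src = "http://example.com/old"; // placeholder URLs
        String dst = "http://example.com/new";
        System.out.println("parsing mode:   " + fetchWithParsing(src, dst));
        System.out.println("separate parse: " + fetchThenParse(src, dst));
    }
}
```

In this model, a fix would make fetchThenParse also set the source URL to REDIRECTED, so that both paths yield the same map.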