[ 
https://issues.apache.org/jira/browse/NUTCH-2509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yossi Tamari updated NUTCH-2509:
--------------------------------
    Attachment: SitemapProcessor.patch

> Inconsistent behavior in SitemapProcessor
> -----------------------------------------
>
>                 Key: NUTCH-2509
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2509
>             Project: Nutch
>          Issue Type: Bug
>          Components: sitemap
>    Affects Versions: 1.14
>            Reporter: Yossi Tamari
>            Priority: Minor
>         Attachments: SitemapProcessor.patch
>
>
> SitemapProcessor behaves inconsistently in two ways:
>  # The member variable maxRedir is supposed to limit the number of 
> redirections followed for sitemap URLs, and it is initialized from the config 
> property sitemap.redir.max. However, it is ignored in the code because a 
> local variable with the same name is defined in the relevant method and is 
> always set to 3.
>  # When a sitemap URL goes through a redirect, it is filtered and normalized. 
> However, a sitemap URL that comes from a sitemapindex is not. This seems 
> inconsistent, as in both cases the URL comes from an outside source.
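
The first point is a case of local-variable shadowing. The following is a minimal Java sketch of that pattern, not the actual SitemapProcessor code: only the names maxRedir and sitemap.redir.max come from the issue, while the class ShadowingSketch, its methods, and the Map standing in for the configuration are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration class; not part of Nutch.
class ShadowingSketch {
    // Member variable meant to hold the configured redirect limit.
    private int maxRedir = 3;

    // A plain Map stands in for the real configuration object (assumption).
    ShadowingSketch(Map<String, Integer> conf) {
        this.maxRedir = conf.getOrDefault("sitemap.redir.max", 3);
    }

    // Buggy shape: the local declaration shadows the member, so the
    // configured value is never consulted.
    int effectiveLimitShadowed() {
        int maxRedir = 3; // shadows this.maxRedir
        return maxRedir;
    }

    // Fixed shape: no local declaration, so the member is used.
    int effectiveLimitFixed() {
        return maxRedir;
    }

    public static void main(String[] args) {
        Map<String, Integer> conf = new HashMap<>();
        conf.put("sitemap.redir.max", 5);
        ShadowingSketch s = new ShadowingSketch(conf);
        System.out.println(s.effectiveLimitShadowed()); // prints 3
        System.out.println(s.effectiveLimitFixed());    // prints 5
    }
}
```

With the property set to 5, the shadowed method still reports 3, which matches the behavior described in the issue. Removing the local declaration is enough for the configured value to take effect in this sketch.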



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
