Sebastian Nagel updated NUTCH-2509:
    Fix Version/s: 1.15

> Inconsistent behavior in SitemapProcessor
> -----------------------------------------
>                 Key: NUTCH-2509
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2509
>             Project: Nutch
>          Issue Type: Bug
>          Components: sitemap
>    Affects Versions: 1.14
>            Reporter: Yossi Tamari
>            Priority: Minor
>             Fix For: 1.15
>         Attachments: SitemapProcessor.patch
> There are two inconsistent behaviors in SitemapProcessor:
>  # There is a member variable maxRedir that is supposed to limit the number
> of redirects followed for sitemap URLs. It is initialized from the config
> property sitemap.redir.max, but the code ignores it because the relevant
> method defines a local variable with the same name that is always set to 3.
>  # When a sitemap URL goes through a redirect, it is filtered and normalized.
> However, if a sitemap URL comes from a sitemapindex, it is not. This seems
> inconsistent, as in both cases the URL comes from an outside source.
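
The two points above can be illustrated with a minimal, self-contained Java sketch. It is not the Nutch source and not the attached patch: the class SitemapProcessorSketch, its constructor, and all method names are hypothetical, and the normalize-then-filter order is an assumption. Only the field name maxRedir, the hard-coded value 3, and the two URL sources come from the report.

```java
import java.util.Optional;
import java.util.function.Predicate;
import java.util.function.UnaryOperator;

/** Hypothetical stand-in for the class described in the report. */
public class SitemapProcessorSketch {

    // Configured limit; the report says it is read from sitemap.redir.max.
    private final int maxRedir;
    // Stand-ins for URL normalizers and filters (assumed shapes, not Nutch APIs).
    private final UnaryOperator<String> normalizer;
    private final Predicate<String> filter;

    public SitemapProcessorSketch(int maxRedir,
                                  UnaryOperator<String> normalizer,
                                  Predicate<String> filter) {
        this.maxRedir = maxRedir;
        this.normalizer = normalizer;
        this.filter = filter;
    }

    /** Point 1, buggy shape: the local variable shadows the field. */
    public int redirectLimitShadowed() {
        int maxRedir = 3; // hides this.maxRedir, so the configured value is never read
        return maxRedir;
    }

    /** Point 1, fixed shape: no local declaration, so the field is used. */
    public int redirectLimit() {
        return maxRedir;
    }

    /** Shared step: normalize first, then filter (order is an assumption). */
    public Optional<String> filterNormalize(String url) {
        String normalized = normalizer.apply(url);
        return filter.test(normalized) ? Optional.of(normalized) : Optional.empty();
    }

    /** Point 2: a redirect target goes through the shared step. */
    public Optional<String> fromRedirect(String target) {
        return filterNormalize(target);
    }

    /** Point 2: a sitemapindex entry goes through the same shared step. */
    public Optional<String> fromSitemapIndex(String entry) {
        return filterNormalize(entry);
    }

    public static void main(String[] args) {
        SitemapProcessorSketch s = new SitemapProcessorSketch(
                5, String::trim, u -> u.startsWith("https://"));
        System.out.println(s.redirectLimitShadowed()); // prints 3
        System.out.println(s.redirectLimit());         // prints 5
        System.out.println(s.fromSitemapIndex(" https://example.org/sitemap.xml ").isPresent());
    }
}
```

Routing both URL sources through a single helper is one way to keep the two paths from drifting apart; whether the actual fix takes this shape is not stated in the report.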

This message was sent by Atlassian JIRA
