Hi,

Is this issue resolved in https://issues.apache.org/jira/browse/NUTCH-1044
for the case when db.ignore.external.links is set to true?

Thanks.
Alex.

-----Original Message-----
From: Ferdy Galema <ferdy.gal...@kalooga.com>
To: user <user@nutch.apache.org>
Sent: Thu, Nov 17, 2011 6:01 am
Subject: Re: http.redirect.max


Thanks for updating the list.

On 11/17/2011 02:52 PM, Rafael Pappert wrote:
> Hi,
>
> After some investigation I found the problem.
> I had db.ignore.external.links set to true, which is why
> the fetcher isn't following the redirect from domain.com to
> www.domain.com.
>
> Rafael.
>
>
>
> On 16/Nov/2011, at 20:17, Rafael Pappert wrote:
>
>> Hello List,
>>
>> Is it possible to follow HTTP 301 redirects immediately?
>>
>> I tried setting http.redirect.max to 3, but the page is
>> still not indexed. readdb still shows 1 page as
>> unfetched / db_redir_perm, and I can't find the
>> redirection target in the crawldb.
>>
>> How does Nutch handle redirects?
>>
>> Thanks in advance,
>> Rafael.
>>
>>
>>
>>
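
For readers of the archive, the behaviour Rafael describes can be sketched
roughly as follows. This is not Nutch code, only an illustration in Python.
It assumes that "external" is decided by comparing host names, so that
domain.com and www.domain.com count as different hosts. The function names
is_external and follow_redirect are hypothetical.

```python
from urllib.parse import urlsplit


def is_external(from_url: str, to_url: str) -> bool:
    """Return True when the two URLs have different host names.

    Assumption: hosts are compared as whole strings, case-insensitively,
    so "domain.com" and "www.domain.com" are treated as different hosts.
    """
    from_host = (urlsplit(from_url).hostname or "").lower()
    to_host = (urlsplit(to_url).hostname or "").lower()
    return from_host != to_host


def follow_redirect(from_url: str, to_url: str, ignore_external_links: bool) -> bool:
    """Decide whether a redirect target would be kept (hypothetical rule).

    With ignore_external_links set, a redirect to another host is dropped,
    which matches the symptom in this thread: the page stays at
    db_redir_perm and the target never shows up in the crawldb.
    """
    return not (ignore_external_links and is_external(from_url, to_url))


if __name__ == "__main__":
    src = "http://domain.com/"
    dst = "http://www.domain.com/"
    print(is_external(src, dst))
    print(follow_redirect(src, dst, ignore_external_links=True))
    print(follow_redirect(src, dst, ignore_external_links=False))
```

Under that assumption, seeding the crawl with the www.domain.com URL directly,
or setting db.ignore.external.links to false, would avoid the dropped redirect.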

 
