Hi,

Is this issue resolved in https://issues.apache.org/jira/browse/NUTCH-1044 for the case when db.ignore.external.links is set to true?

Thanks.
Alex.

-----Original Message-----
From: Ferdy Galema <ferdy.gal...@kalooga.com>
To: user <user@nutch.apache.org>
Sent: Thu, Nov 17, 2011 6:01 am
Subject: Re: http.redirect.max

Thanks for updating the list.

On 11/17/2011 02:52 PM, Rafael Pappert wrote:
> Hi,
>
> after some investigation i got the problem.
> I had db.ignore.external.links set to true, this is why
> fetcher isn't following the redirection from domain.com to
> www.domain.com.
>
> Rafael.
>
> On 16/Nov/2011, at 20:17, Rafael Pappert wrote:
>
>> Hello List,
>>
>> is it possible to follow http 301 redirects immediately?
>>
>> I tried to set http.redirect.max to 3 but the page is
>> still not indexed. readdb is still showing 1 page is
>> unfetched / db_redir_perm. And I can't find the
>> redirection target in the crawldb.
>>
>> How does nutch handle redirects?
>>
>> Thanks in advance,
>> Rafael.