Re: nutch redirection issue

Bai Shen Thu, 11 Jul 2013 04:22:34 -0700

Have you done another crawl?  By default, Nutch puts the redirect target into
the database as a new URL to be crawled in a later round.  So you will find the
content under the target URL of the redirect, not under the original URL.

If I remember correctly, there used to be a setting that would have Nutch
follow the redirect instead of storing it as a new url, but I can't seem to
find it at the moment.
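
A likely candidate is the http.redirect.max property from nutch-default.xml.
As far as I know, a value of 0 or less makes the fetcher record the redirect
target for a later round, and a positive value makes it follow up to that many
redirects right away. A minimal sketch of an override in conf/nutch-site.xml,
assuming Nutch 1.x; the value 3 is only an illustrative choice, so check the
description in your own nutch-default.xml before relying on it:

```xml
<!-- conf/nutch-site.xml: overrides values from nutch-default.xml -->
<configuration>
  <property>
    <name>http.redirect.max</name>
    <!-- Assumed example value: follow up to 3 redirects during the fetch
         instead of queuing the target as a new URL. -->
    <value>3</value>
  </property>
</configuration>
```

With a setting like this, the content should end up with the fetched page in
the same round rather than waiting for the next generate/fetch cycle.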

On Thu, Jul 11, 2013 at 5:48 AM, devang pandey <[email protected]> wrote:

> Hello,
>
> I am a bit new to Nutch. I am crawling a URL which redirects to
> another URL. When analysing my crawl results, I get the content of the first
> URL along with the status code: temp redirected to (second url name). My
> question is: why am I not getting the content and details of the second
> URL? Please help.
>
