Hi Canan,
Thank you for bringing this up. I just noticed that 2.x does not have the
following configurable property in nutch-default.xml:

<property>
  <name>http.redirect.max</name>
  <value>0</value>
  <description>The maximum number of redirects the fetcher will follow when
  trying to fetch a page. If set to negative or 0, fetcher won't immediately
  follow redirected URLs, instead it will record them for later fetching.
  </description>
</property>

I've also looked over the trunk and 2.x branches, and it seems that trunk
is more functionally capable when it comes to handling redirects.
I don't have time to look into this just now.
You could start with the trunk code, then compare it with 2.x, to see how
redirects should be handled and how a configurable depth can be specified
for fetching such URLs.
It seems that we need to add this functionality to 2.x.
Contributions would be very welcome on this issue.
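
For reference, a minimal sketch of how the redirect depth could be set on
trunk, assuming the usual practice of overriding nutch-default.xml values in
conf/nutch-site.xml. The value 3 is only an illustrative choice, and going by
the above, 2.x may not honor this property at all:

<configuration>
  <!-- Assumption: trunk (1.x) fetcher; 2.x may ignore this setting. -->
  <property>
    <name>http.redirect.max</name>
    <!-- Illustrative value: follow up to 3 redirects immediately. -->
    <value>3</value>
  </property>
</configuration>

With the value left at 0 or set negative, the description above says the
fetcher records redirect targets for later fetching instead of following them.
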
Lewis

On Mon, Mar 25, 2013 at 1:17 PM, Canan GİRGİN <[email protected]> wrote:

> Hi,
>
> I use the "bin/nutch parsechecker" command (Nutch 2.1). It works fine. But
> when I try the parsechecker command with a redirected page, parseFilters
> returns wrong results, because the parse text contains redirect descriptions.
>
> Is there any problem?
>
> Thanks, Canan
>
> Nutch 2.1 / Ubuntu 12.04 / MySQL
>



-- 
*Lewis*
