Hi All,

I am running into a situation where the reduce phase of the fetch job, with
parsing enabled at fetch time, is taking an excessively long time. I have
seen recommendations to filter URLs by length to avoid normalization-related
delays. I am not currently filtering any URLs by length; could that be the
issue?
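
To make the question concrete, here is a minimal standalone sketch of the kind
of length check I understand those recommendations to mean. It is not code from
my setup and not an actual Nutch plugin: the class name UrlLengthFilter and the
512-character cutoff are made up for illustration, and it only mirrors the
contract of Nutch's URLFilter.filter(String), which returns the URL to keep it
or null to reject it.

```java
public class UrlLengthFilter {
    // Hypothetical cutoff chosen for illustration; not a Nutch default.
    static final int MAX_URL_LENGTH = 512;

    // Same contract as Nutch's URLFilter.filter(String):
    // return the URL to keep it, or null to drop it.
    static String filter(String url) {
        if (url == null || url.length() > MAX_URL_LENGTH) {
            return null;
        }
        return url;
    }

    public static void main(String[] args) {
        // Build a URL that is guaranteed to exceed the cutoff.
        StringBuilder longUrl = new StringBuilder("http://example.com/?q=");
        while (longUrl.length() <= MAX_URL_LENGTH) {
            longUrl.append('x');
        }
        System.out.println(filter("http://example.com/page.html")); // kept
        System.out.println(filter(longUrl.toString()));             // dropped
    }
}
```

If a check like this is the fix, I assume it would have to run before the
normalizers see the URL, otherwise the expensive step has already happened.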

Can anyone share whether they have faced this issue and what the resolution
was? I am running Nutch 1.7 on Hadoop YARN.

The issue was discussed here previously, but without a conclusion:

http://markmail.org/message/p6dzvvycpfzbaugr#query:+page:1+mid:p6dzvvycpfzbaugr+state:results

Thanks.
