Hi Meraj

You could call jstack on the Java process a couple of times to see what it
is busy doing. That is a simple way of checking whether this is indeed
the source of the problem.
See https://issues.apache.org/jira/browse/NUTCH-1314 for a possible
solution.
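
In case it helps, here is a rough Python sketch of the "call jstack a couple
of times" idea. It runs `jstack <pid>` a few times and counts which method
sits at the top of each thread's stack, so a frame that keeps showing up
(for example inside a regex or URL normalizer) stands out. The function
names, the sample count and the interval are my own choices for illustration
and are not part of Nutch or the JDK. It assumes jstack is on the PATH and
that you run it on the node hosting the reduce task, most likely as the user
that owns the process.

```python
import collections
import re
import subprocess
import time

# Thread header lines in jstack output start with the quoted thread name.
THREAD_HEADER = re.compile(r'^"(?P<name>[^"]+)"')
# Stack frames look like "\tat package.Class.method(File.java:123)".
FRAME = re.compile(r'^\s+at\s+(?P<frame>[^\s(]+)\(')


def top_frames(dump: str) -> dict:
    """Map each thread name to its topmost stack frame in one jstack dump.

    Threads that print no stack frames (e.g. some VM threads) are omitted.
    """
    result = {}
    current = None
    for line in dump.splitlines():
        header = THREAD_HEADER.match(line)
        if header:
            current = header.group("name")
            continue
        frame = FRAME.match(line)
        # Only keep the first frame seen after each thread header.
        if frame and current is not None and current not in result:
            result[current] = frame.group("frame")
    return result


def sample(pid: int, samples: int = 3, interval: float = 5.0) -> collections.Counter:
    """Run `jstack <pid>` several times and count the top frames observed."""
    counts = collections.Counter()
    for i in range(samples):
        dump = subprocess.run(
            ["jstack", str(pid)], capture_output=True, text=True, check=True
        ).stdout
        counts.update(top_frames(dump).values())
        if i < samples - 1:
            time.sleep(interval)
    return counts
```

Something like `sample(12345).most_common(5)` (12345 being a placeholder pid)
would then list the frames seen most often across the dumps. Reading the raw
jstack output by eye works just as well; the script only saves some
scrolling.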

J.

On 16 October 2014 06:08, Meraj A. Khan <[email protected]> wrote:

> Hi All,
>
> I am running into a situation where the reduce phase of the fetch job, with
> parsing enabled at fetch time, is taking an excessively long time. I have
> seen recommendations to filter URLs by length to avoid
> normalization-related delays. I am not filtering any URLs by length; could
> that be the issue?
>
> Can anyone share whether they faced this issue and what the resolution
> was? I am running Nutch 1.7 on Hadoop YARN.
>
> The issue was discussed here previously, without a conclusion:
>
>
> http://markmail.org/message/p6dzvvycpfzbaugr#query:+page:1+mid:p6dzvvycpfzbaugr+state:results
>
> Thanks.
>



-- 

Open Source Solutions for Text Engineering

http://digitalpebble.blogspot.com/
http://www.digitalpebble.com
http://twitter.com/digitalpebble
