Hi Kaveh,

We have recently been informed about parsing taking forever and a day
in the reduce phase. This is currently being investigated. FYI, the
thread can be found below:

http://www.mail-archive.com/user%40nutch.apache.org/msg06560.html

I wonder if you have looked into this and if there is a more general
link between such issues?

Lewis

On Wed, Jun 13, 2012 at 1:31 AM, kaveh minooie <[email protected]> wrote:
> Hi everybody
>
> I have an unusual issue. When I run Nutch on top of Hadoop, after the map
> tasks finish, the reduce tasks start finishing very quickly. Almost all of them
> finish in less than 2 hours, but there are always one or two that take a lot
> longer. This is a link to the list of completed reduce tasks (that is, all
> of them for that fetch job), and you can see in the list that the last one
> took more than 18 hours to finish and there is another one that took more
> than 6 hours. Does anybody have any idea why this is happening?
>
> http://plutooz.com/hadoop.html
>
> p.s. this fetch job had about 1.5 million pages in it.
>
> thanks,



-- 
Lewis
