Hi everybody,
I have an unusual issue. When I run Nutch on top of Hadoop, after the
map tasks finish, the reduce tasks start and most of them finish very
fast: almost all of them finish in less than 2 hours, but there are
always one or two that take a lot longer. Here is a link to the list of
completed reduce tasks (that is all of them for that fetch job). You can
see on the list that the last one took more than 18 hours to finish, and
there is another one that took more than 6 hours. Does anybody have any
idea why this is happening?
http://plutooz.com/hadoop.html
P.S. This fetch job had about 1.5 million pages in it.
Thanks,