Vishal Shah wrote:
Hi,
We upgraded our code to the Nutch 0.9 stable release along with Hadoop 0.12.3,
the latest release in the Hadoop 0.12 line.
After the upgrade, I am sometimes seeing task failures during the reduce phase
for parse and for fetch (without the parsing option).
Usually it's just one reduce task that causes the problem. The jobtracker
kills this task with the message "Task failed to report status for 602
seconds. Killing task".
I tried running the task using IsolationRunner, and it works fine. I suspect
that a long computation during the reduce phase for one of the keys is
preventing the tasktracker from reporting status to the jobtracker in time.
If you suspect a long computation, one way to handle it is to use the
'reporter' parameter passed to your mapper/reducer to provide periodic status
updates. That ensures the TaskTracker doesn't kill the task, i.e. doesn't
assume the task has been lost.
hth,
Arun
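The pattern Arun describes can be sketched as below. Note this is a
self-contained illustration, not a drop-in Hadoop reducer: the real
Reporter lives in org.apache.hadoop.mapred and is handed to reduce() by
the framework, and names like REPORT_INTERVAL are made up for the sketch.

```java
import java.util.Iterator;
import java.util.stream.IntStream;

public class ReportingReduceSketch {
    // Minimal stand-in for org.apache.hadoop.mapred.Reporter, which the
    // framework passes into reduce(); calling setStatus resets the
    // tasktracker's task timeout (600s by default).
    interface Reporter {
        void setStatus(String status);
    }

    // Illustrative value: how many records to process between reports.
    static final int REPORT_INTERVAL = 1000;

    // A long-running per-key reduce loop that reports as it goes, so the
    // TaskTracker never sees 600+ seconds of silence for one key.
    static int reduce(String key, Iterator<Integer> values, Reporter reporter) {
        int processed = 0;
        while (values.hasNext()) {
            values.next();               // expensive per-value work goes here
            processed++;
            if (processed % REPORT_INTERVAL == 0) {
                reporter.setStatus("key " + key + ": " + processed + " values done");
            }
        }
        return processed;
    }

    public static void main(String[] args) {
        final int[] reports = {0};
        int n = reduce("example",
                IntStream.range(0, 5000).iterator(),
                status -> reports[0]++);   // count status updates for the demo
        System.out.println(n + " values, " + reports[0] + " status updates");
    }
}
```

In a real job, calling reporter.setStatus() (or reporter.progress()) inside
the loop over values is enough to keep the task alive without touching the
mapred.task.timeout setting.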
I was wondering if anyone else has seen a similar problem and if there is
a fix for it.
Thanks,
-vishal.