Each reduce task (Nutch indexing job) gets as far as 66%, and then fails with
the following error:
"Task failed to report status for 600 seconds. Killing."
In the end, no reduce task completes successfully.
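For reference, the "600 seconds" in that message corresponds to Hadoop's task-timeout setting: a task attempt that neither emits output nor reports progress within that window is killed. A long-running reduce can avoid this by periodically calling `Reporter.progress()` (or `Reporter.setStatus(...)`) inside its loop; alternatively the timeout itself can be raised. A minimal configuration sketch, assuming the 0.x-era property name `mapred.task.timeout` (value in milliseconds; its default of 600000 ms matches the error above):

```xml
<!-- hadoop-site.xml: raise the per-task timeout.
     Property name and default assumed from Hadoop 0.x-era configs;
     600000 ms (the default) matches the "600 seconds" in the error. -->
<property>
  <name>mapred.task.timeout</name>
  <value>1800000</value> <!-- e.g. 30 minutes -->
</property>
```

Raising the timeout only hides the symptom, though; if the reducer is genuinely making progress, reporting it is the cleaner fix.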
Besides solving this issue, I was wondering whether I could update the code and configuration and restart the reduce phase without redoing all the map tasks (that would save me 2 hours), assuming of course that the output of the map tasks has not changed.
Mathijs
Arun C Murthy wrote:
Hi Mathijs,
Mathijs Homminga wrote:
We have some troubles with the reduce phase of our job.
Is it possible to re-execute the reduce tasks without the need to do
all map tasks again?
That the MR-framework already does... you don't have to re-execute
the maps for the *failed* reduces. Are you noticing something else?
What are the 'troubles' you allude to? Also, once we get
HADOOP-1127 in, you should try turning on 'speculative execution':
that helps when some tasks are very slow w.r.t. other similar tasks.
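For reference, speculative execution is a site/job configuration switch that lets the framework launch backup attempts of slow ("straggler") tasks on other nodes, keeping whichever attempt finishes first. A sketch, assuming the 0.x-era property name `mapred.speculative.execution`:

```xml
<!-- hadoop-site.xml: enable speculative execution so straggling
     task attempts get duplicated on other nodes.
     Property name assumed from Hadoop 0.x-era configuration. -->
<property>
  <name>mapred.speculative.execution</name>
  <value>true</value>
</property>
```

Note this only helps when a task is slow relative to its siblings; it does not help here if every reduce stalls at the same point.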
Arun
Thanks!
Mathijs Homminga