Hello Jane,

On Tue, Mar 29, 2011 at 4:40 AM, Jane Chen <[email protected]> wrote:
> There are times when I don't have an accurate count of the total records to 
> be processed, and I wonder the impact on task scheduling when returning an 
> inaccurate progress percentage.  I found that when I return either 0 when not 
> done and 1 when done will make the job hang.

What exactly do you mean by the job 'hangs' when you always return 0
until done and 1 once done? Do you mean the task gets killed and
restarted?

Whenever the progress or the status message changes, a Task status
report is sent back via the reporter to the TIP (TaskInProgress) object
held by the parent TaskTracker. If a TIP has not received a task report
for longer than mapred.task.timeout (in milliseconds; 600000, i.e. 600s,
by default; set it to 0 to never purge), it purges the task, claiming
that it has hung or gone unresponsive, and the task gets rescheduled.
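
To make the timeout rule above concrete, here is a toy model of it in
plain Java. This is only a sketch of the behaviour as I described it,
not Hadoop's actual code: the class TaskWatchdogSketch and its methods
are made-up names, and the only thing taken from Hadoop is the meaning
of the timeout value (milliseconds, 0 disables the check).

```java
// Toy model of the "no report within the timeout => considered hung" rule.
// Hypothetical class, NOT part of Hadoop.
final class TaskWatchdogSketch {
    private final long timeoutMs;   // plays the role of mapred.task.timeout; 0 = never purge
    private long lastReportMs;      // time of the most recent status report

    TaskWatchdogSketch(long timeoutMs, long startMs) {
        this.timeoutMs = timeoutMs;
        this.lastReportMs = startMs;
    }

    // Called whenever a progress or status change reaches the tracker.
    void onStatusReport(long nowMs) {
        lastReportMs = nowMs;
    }

    // True if the task has been silent for longer than the timeout.
    boolean isConsideredHung(long nowMs) {
        return timeoutMs > 0 && (nowMs - lastReportMs) > timeoutMs;
    }
}
```

With a 600000 ms timeout, a task that last reported at t=0 is still
fine at t=600000 and is treated as hung after that; with a timeout of 0
it is never treated as hung.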

If you're not sure what your progress is while processing records in
your RR (RecordReader), set progress to a random value between 0.0 and
1.0 so that it keeps changing; it shouldn't matter to the framework if
the progress decreases in value.
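
A minimal sketch of that idea, written as plain Java so it runs without
Hadoop on the classpath. ProgressSketch, recordProcessed() and the
"total <= 0 means unknown" convention are all my own assumptions for
illustration; in a real RecordReader you would return a value like this
from getProgress().

```java
import java.util.Random;

// Hypothetical helper, NOT a Hadoop class: computes the value a
// RecordReader's getProgress() could return.
final class ProgressSketch {
    private final long totalRecords;          // <= 0 means "total unknown" (assumed convention)
    private long processed = 0;
    private final Random random = new Random();

    ProgressSketch(long totalRecords) {
        this.totalRecords = totalRecords;
    }

    // Call once per record handed to the mapper.
    void recordProcessed() {
        processed++;
    }

    // Real fraction when the total is known, otherwise a random
    // value in [0.0, 1.0) so that the reported progress keeps changing.
    float getProgress() {
        if (totalRecords > 0) {
            return Math.min(1.0f, (float) processed / totalRecords);
        }
        return random.nextFloat();
    }
}
```

The random branch is only there to keep the reported value moving when
no count is available; the number itself carries no meaning.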

-- 
Harsh J
