> On 3 Feb 2017, at 20:02, Chris Douglas <chris.doug...@gmail.com> wrote:
> 
> It's been a long time, but IIRC this isn't going to be invoked. The AM
> will never set the preempt flag in the umbilical, so the task will
> never transition to this state.
> 
> MapReduce checkpoint/restart of reduce tasks was going to be part of
> MAPREDUCE-5269, which signals a ReduceTask to promote its partial
> output if both the Reducer and OutputCommitter are tagged as
> @Checkpointable. If either is not, then the flag is never set. The
> code that would have implemented this was not committed, so it's
> really-really not going to be set. -C

I didn't think it was being used, but thanks for clarifying this.

Should that code snippet be culled? Or should the branch at least be changed so 
that it actually calls abortTask?
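
For reference, a minimal sketch of what a preempted branch that aborts rather than commits might look like. This is not the Hadoop code: SketchCommitter, RecordingCommitter and PreemptionSketch are hypothetical stand-ins, the attempt is identified by a plain string rather than a TaskAttemptContext, and the non-preempted path is reduced to a bare commit.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the commit/abort half of OutputCommitter.
// The real class takes a TaskAttemptContext and throws IOException.
interface SketchCommitter {
    void commitTask(String attemptId);
    void abortTask(String attemptId);
}

// Records which calls were made, so the behaviour can be inspected.
class RecordingCommitter implements SketchCommitter {
    final List<String> calls = new ArrayList<>();

    @Override
    public void commitTask(String attemptId) {
        calls.add("commit:" + attemptId);
    }

    @Override
    public void abortTask(String attemptId) {
        calls.add("abort:" + attemptId);
    }
}

class PreemptionSketch {
    // Mirrors the shape of the PREEMPTED branch in Task.done(), but
    // discards the attempt's output instead of committing it.
    static void done(SketchCommitter committer, String attemptId,
                     boolean preempted) {
        if (preempted) {
            // Preempted: do no output promotion, abort pending output.
            committer.abortTask(attemptId);
            return;
        }
        // Simplified: the real method does more before committing.
        committer.commitTask(attemptId);
    }
}
```

With a RecordingCommitter, calling done(c, "attempt_1", true) records only an abort for that attempt, which is the behaviour the "do no output promotion" comment describes.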

> 
> On Fri, Feb 3, 2017 at 6:41 AM, Steve Loughran <ste...@hortonworks.com> wrote:
>> 
>> In HADOOP-13786 I'm adding a new committer, one which writes to S3 without 
>> doing renames. It does this by submitting all the data to S3 targeted at the 
>> final destination, but doesn't send the POST needed to materialize it until 
>> the task commits. Abort the task and it cancels these pending commits.
>> 
>> This algorithm should be robust provided that only one attempt for a task is 
>> committed, which comes down to:
>> 
>> 1.  Only those tasks which have succeeded are committed.
>> 2.  Those tasks which have not succeeded have their pending writes aborted.
>> 
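
The two rules above can be modelled in a few lines. This is only an in-memory sketch under stated assumptions: PendingCommitSketch and its methods are hypothetical names, no S3 calls are made, and "materializing" a write is represented by moving its destination path into a set of visible paths.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical in-memory model of "upload now, materialize on commit".
class PendingCommitSketch {
    // Writes made by each task attempt that are not yet visible.
    private final Map<String, List<String>> pending = new HashMap<>();
    // Destination paths that have been materialized.
    private final Set<String> visible = new TreeSet<>();

    // Record data uploaded towards its final destination.
    void upload(String attemptId, String destPath) {
        pending.computeIfAbsent(attemptId, k -> new ArrayList<>())
               .add(destPath);
    }

    // Task commit: complete every pending write of this attempt.
    void commitTask(String attemptId) {
        List<String> paths = pending.remove(attemptId);
        if (paths != null) {
            visible.addAll(paths);
        }
    }

    // Task abort: cancel the pending writes so nothing becomes visible.
    void abortTask(String attemptId) {
        pending.remove(attemptId);
    }

    Set<String> visiblePaths() {
        return visible;
    }

    boolean hasPending(String attemptId) {
        return pending.containsKey(attemptId);
    }
}
```

If two attempts of the same task both upload to the same destination, committing one and aborting the other leaves a single visible path and no pending writes. Committing an attempt that should have been aborted is exactly the case the rules are meant to exclude.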
>> 
>> Which is where I now have a question. In the class 
>> org.apache.hadoop.mapred.Task, OutputCommitter.commitTask() is called when a 
>> task is pre-empted:
>> 
>> 
>>  public void done(TaskUmbilicalProtocol umbilical,
>>                   TaskReporter reporter
>>                   ) throws IOException, InterruptedException {
>>    updateCounters();
>>    if (taskStatus.getRunState() == TaskStatus.State.PREEMPTED ) {
>>      // If we are preempted, do no output promotion; signal done and exit
>>      committer.commitTask(taskContext);         /* HERE */
>>      umbilical.preempted(taskId, taskStatus);
>>      taskDone.set(true);
>>      reporter.stopCommunicationThread();
>>      return;
>>    }
>> 
>> That's despite the line above saying "do no output promotion", and, judging 
>> by its place in the code, looking like it's the handler for task preempted 
>> state.
>> 
>> Shouldn't it be doing a task abort here?
>> 
>> I suspect the sole reason this hasn't shown up as a problem before is that 
>> this is the sole use of TaskStatus.State.PREEMPTED in the hadoop code: this 
>> particular codepath is never executed. In which case, culling it may be 
>> the correct option.
>> 
>> Thoughts?
>> 
>> -Steve
>> 
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org
>> For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
>> 
> 

