[ 
https://issues.apache.org/jira/browse/HADOOP-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12599749#action_12599749
 ] 

Devaraj Das commented on HADOOP-3150:
-------------------------------------

An approach:
1) The task framework code knows when the task is done. If the output format 
is FileOutputFormat, the task then sets its own state to COMMIT_PENDING and 
reports that status to the tasktracker. 
2) The tasktracker forwards this status to the JobTracker. The JobTracker 
notes it and:
    2.1) sends a COMMITTASKACTION for that task if this is the first attempt 
to report COMMIT_PENDING;
    2.2) kills all other attempts of that task. If COMMIT_PENDING statuses 
arrive at the same time from two running attempts, the first one wins.
3) The tasktracker receives the COMMITTASKACTION and notes it. The task does 
not commit its output until the tasktracker agrees to it (via a new RPC such 
as canCommit()).
4) If the commit fails, the task attempt fails, and the JobTracker then 
re-executes the task.
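
The arbitration in steps 2 to 4 could be sketched roughly as below. This is 
only an illustration: CommitArbiter, its methods, and the string ids are 
hypothetical names, not existing Hadoop classes, and the sketch collapses the 
JobTracker and tasktracker bookkeeping into one object for brevity.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of "first COMMIT_PENDING wins"; not Hadoop code.
public class CommitArbiter {
    public enum Action { COMMIT_TASK, KILL_TASK }

    // task id -> attempt id that was first to report COMMIT_PENDING
    private final Map<String, String> committer = new HashMap<>();

    // Step 2: a COMMIT_PENDING status arrives for an attempt.
    // The first attempt per task is told to commit; later ones are killed.
    public synchronized Action onCommitPending(String taskId, String attemptId) {
        String winner = committer.putIfAbsent(taskId, attemptId);
        if (winner == null || winner.equals(attemptId)) {
            return Action.COMMIT_TASK;
        }
        return Action.KILL_TASK;
    }

    // Step 3: the task asks before promoting its output.
    public synchronized boolean canCommit(String taskId, String attemptId) {
        return attemptId.equals(committer.get(taskId));
    }

    // Step 4: the commit failed, so forget the winner and let a
    // re-executed attempt claim the commit later.
    public synchronized void onCommitFailed(String taskId, String attemptId) {
        committer.remove(taskId, attemptId);
    }
}
```

Making the methods synchronized is one simple way to get the "first one wins" 
behaviour when two statuses arrive at the same time; the real JobTracker may 
serialize status updates differently.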

> Move task file promotion into the task
> --------------------------------------
>
>                 Key: HADOOP-3150
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3150
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Owen O'Malley
>            Assignee: Devaraj Das
>             Fix For: 0.18.0
>
>
> We need to move the task file promotion from the JobTracker to the Task and 
> move it down into the output format.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.