[ https://issues.apache.org/jira/browse/HADOOP-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12599749#action_12599749 ]
Devaraj Das commented on HADOOP-3150:
-------------------------------------
An approach:
1) The task framework code knows when it is done with the task. If the
output format is FileOutputFormat, the task at that point sets its own state
to COMMIT_PENDING and reports that status to the tasktracker.
2) The tasktracker forwards this status to the JobTracker. The JobTracker
notes it and then:
2.1) sends a COMMITTASKACTION for that task if this is the first attempt of
the task to report COMMIT_PENDING;
2.2) kills all other attempts of the task. If COMMIT_PENDING statuses arrive
at the same time from two running attempts, the first one wins.
3) The tasktracker receives the COMMITTASKACTION and notes it. The task does
not commit its output until the tasktracker agrees to it (via a new RPC such
as canCommit()).
4) If the commit fails, the task attempt fails, and the JobTracker then
re-executes the task.
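The first-wins rule in steps 2 and 3 could be pictured roughly as in the
sketch below. It is only an illustration, not Hadoop code: the class
CommitArbiter and its methods onCommitPending, canCommit, and onCommitFailed
are hypothetical names, task and attempt ids are plain strings, and the
JobTracker and tasktracker roles are collapsed into one object for brevity.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of commit arbitration; none of these names are Hadoop APIs.
final class CommitArbiter {
    // Maps a task id to the single attempt id that has been allowed to commit.
    private final Map<String, String> committer = new HashMap<>();

    // Step 2: a COMMIT_PENDING status arrived for this attempt.
    // Returns true if this attempt should be sent the commit action,
    // false if another attempt got there first (and this one should be killed).
    synchronized boolean onCommitPending(String taskId, String attemptId) {
        String winner = committer.putIfAbsent(taskId, attemptId);
        return winner == null || winner.equals(attemptId);
    }

    // Step 3: the canCommit()-style check the task makes before promoting output.
    synchronized boolean canCommit(String taskId, String attemptId) {
        return attemptId.equals(committer.get(taskId));
    }

    // Step 4: the commit failed, so clear the grant and let a re-executed
    // attempt of the same task ask again.
    synchronized void onCommitFailed(String taskId, String attemptId) {
        committer.remove(taskId, attemptId);
    }
}
```

Serializing the three methods on one lock is what makes "the first one
wins" well defined here even when two statuses arrive together; a real
implementation would also have to deal with lost tasktrackers and timeouts,
which this sketch ignores.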
> Move task file promotion into the task
> --------------------------------------
>
> Key: HADOOP-3150
> URL: https://issues.apache.org/jira/browse/HADOOP-3150
> Project: Hadoop Core
> Issue Type: Improvement
> Components: mapred
> Reporter: Owen O'Malley
> Assignee: Devaraj Das
> Fix For: 0.18.0
>
>
> We need to move the task file promotion from the JobTracker to the Task and
> move it down into the output format.