Just a question:
did we consider "speculative execution of reduces with sub-splitting"?
That is, for each of the last / slow reduce tasks, run several tasks,
each sub-splitting the input of the original task?
The implementation would probably be more complicated, but the benefits
could be huge -- currently, it is not uncommon to see the single last
reduce task running for hours (if not days).
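To make the idea concrete, here is a minimal toy sketch of how the input of a slow reduce might be cut into contiguous sub-splits. All names here are hypothetical and nothing like this exists in the current code; it only illustrates the proposal:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: divide n sorted key groups into k contiguous
// ranges [start, end). Each range would be handed to its own sub-task,
// and because ranges are cut on key-group boundaries, concatenating the
// per-range outputs in order gives the same file the single reduce
// would have produced.
public class SubSplit {
    public static List<int[]> ranges(int n, int k) {
        List<int[]> out = new ArrayList<>();
        int base = n / k;      // minimum groups per sub-split
        int rem = n % k;       // the first 'rem' sub-splits get one extra
        int start = 0;
        for (int i = 0; i < k; i++) {
            int len = base + (i < rem ? 1 : 0);
            out.add(new int[] { start, start + len });
            start += len;
        }
        return out;
    }

    public static void main(String[] args) {
        for (int[] r : ranges(10, 3)) {
            System.out.println("[" + r[0] + ", " + r[1] + ")");
        }
    }
}
```

The sketch splits 10 key groups across 3 sub-tasks as [0, 4), [4, 7), [7, 10); the real difficulty would be in the bookkeeping for launching, committing, and stitching the sub-tasks, which is not shown.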
-- ab
On Oct 19, 2006, at 10:43 AM, Sanjay Dahiya (JIRA) wrote:
[ http://issues.apache.org/jira/browse/HADOOP-76?page=all ]
Sanjay Dahiya updated HADOOP-76:
--------------------------------
Attachment: Hadoop-76.patch
This patch is up for review.
Here is the list of changes included in this patch -
Replaced recentTasks with a Map and added a new method in
TaskInProgress, hasRanOnMachine(), which consults this Map and
hasFailedOnMachines(). This is used to avoid scheduling multiple reduce
instances of the same task on the same node.
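The placement check can be sketched roughly as follows. This is a standalone toy version with simplified, hypothetical names; the actual TaskInProgress in the patch differs in details:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy analogue of the de-duplication check: a task should not get a
// second (speculative) instance on a machine where it already ran or
// already failed.
public class TaskPlacement {
    // machines where an instance of this task is (or was) running
    private final Map<String, Long> recentMachines = new HashMap<>();
    // machines where this task has already failed
    private final Set<String> failedMachines = new HashSet<>();

    public void noteRunningOn(String host) {
        recentMachines.put(host, System.currentTimeMillis());
    }

    public void noteFailedOn(String host) {
        failedMachines.add(host);
    }

    // Analogue of hasRanOnMachine() combined with hasFailedOnMachines():
    // true means the scheduler should pick a different node.
    public boolean hasRunOnMachine(String host) {
        return recentMachines.containsKey(host)
                || failedMachines.contains(host);
    }

    public static void main(String[] args) {
        TaskPlacement tip = new TaskPlacement();
        tip.noteRunningOn("node-a");
        tip.noteFailedOn("node-b");
        System.out.println(tip.hasRunOnMachine("node-a")); // true
        System.out.println(tip.hasRunOnMachine("node-b")); // true
        System.out.println(tip.hasRunOnMachine("node-c")); // false
    }
}
```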
Added a PhasedRecordWriter, which takes a RecordWriter, a tempName, and
a finalName. Another option was to create a PhasedOutputFormat, but
this approach seems more natural as it works with any existing
OutputFormat and RecordWriter. Records are written to tempName and,
when commit is called, they are moved to finalName.
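The write-then-promote idea looks roughly like this. This is a toy sketch over local java.io files with hypothetical names; the real PhasedRecordWriter wraps Hadoop's RecordWriter and operates on the DFS:

```java
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.io.PrintWriter;

// Toy sketch of two-phase output: records always land in a temp file;
// commit() promotes it to the final name with a rename.
public class PhasedWriter {
    private final File tempFile;
    private final File finalFile;
    private final PrintWriter out;

    public PhasedWriter(File tempFile, File finalFile) throws IOException {
        this.tempFile = tempFile;
        this.finalFile = finalFile;
        this.out = new PrintWriter(new FileWriter(tempFile));
    }

    public void write(String key, String value) {
        out.println(key + "\t" + value);
    }

    // Only the instance that finishes first commits; losing speculative
    // instances are killed and their temp files cleaned up, so no
    // partial or duplicate output is ever published under finalFile.
    public void commit() throws IOException {
        out.close();
        if (!tempFile.renameTo(finalFile)) {
            throw new IOException("could not promote " + tempFile);
        }
    }

    // Self-check: write one record, commit, verify the promotion.
    public static boolean demo() throws IOException {
        File temp = File.createTempFile("part-0", ".tmp");
        File fin = new File(temp.getParent(), "part-0-" + System.nanoTime());
        PhasedWriter w = new PhasedWriter(temp, fin);
        w.write("key", "value");
        w.commit();
        boolean ok = fin.exists() && !temp.exists();
        fin.delete();
        return ok;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(demo());
    }
}
```

Note that java.io.File.renameTo has platform-dependent semantics; the real implementation relies on the Hadoop FileSystem rename instead.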
ReduceTask.run() - if speculative execution is enabled, the reduce
output is written to a temporary location using PhasedRecordWriter.
After the task finishes, the output is moved to its final location.
If some other speculative instance finishes first, then
TaskInProgress.shouldCloseForClosedJob() returns true for the taskId.
On the TaskTracker the task is killed via Process.destroy(), so the
cleanup code lives in TaskTracker instead of Task. The cleanup of Maps
happens in Conf, which is probably misplaced. We could refactor this
part for both Map and Reduce, moving the cleanup code into utility
classes which, given a Map or Reduce task, track the files generated
and clean them up if needed.
Added an extra attribute in TaskInProgress, runningSpeculative, to
avoid running more than one speculative instance of a ReduceTask. Too
many Reduce instances of the same task could increase load on the Map
machines; this needs discussion. I can revert this change to allow some
other number of Reduce instances (MAX_TASK_FAILURES?).
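The single-speculative-instance guard amounts to a flag checked before launching an extra copy. Again a toy sketch with hypothetical names, not the patch's actual code:

```java
// Toy sketch: at most one speculative copy of a task runs at a time.
// tryStartSpeculative() succeeds once; further attempts fail until the
// running speculative copy finishes (or the task completes for good).
public class SpeculationGuard {
    private boolean runningSpeculative = false;
    private boolean completed = false;

    public synchronized boolean tryStartSpeculative() {
        if (completed || runningSpeculative) {
            return false;
        }
        runningSpeculative = true;
        return true;
    }

    public synchronized void speculativeFinished() {
        runningSpeculative = false;
    }

    public synchronized void taskCompleted() {
        completed = true;
    }

    public static void main(String[] args) {
        SpeculationGuard g = new SpeculationGuard();
        System.out.println(g.tryStartSpeculative()); // true: first copy
        System.out.println(g.tryStartSpeculative()); // false: already one
    }
}
```

Replacing the boolean with a counter bounded by some limit (MAX_TASK_FAILURES, say) would be the natural way to allow more than one instance, per the open question above.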
Comments?
Implement speculative re-execution of reduces
---------------------------------------------
Key: HADOOP-76
URL: http://issues.apache.org/jira/browse/HADOOP-76
Project: Hadoop
Issue Type: Improvement
Components: mapred
Affects Versions: 0.1.0
Reporter: Doug Cutting
Assigned To: Sanjay Dahiya
Priority: Minor
Attachments: Hadoop-76.patch, spec_reducev.patch
As a first step, reduce task outputs should go to temporary files
which are renamed when the task completes.