Relationship with streaming?
When defining the implementation for speculative re-execution, please
take into account how Streaming is implemented.
Specifically, streaming tasks are supposed to run in their own working
directories and read and write (temporary) (Unix) files there.
Some of these details may matter for the speculative re-execution
implementation.
On Nov 7, 2006, at 7:24 AM, Devaraj Das (JIRA) wrote:
[ http://issues.apache.org/jira/browse/HADOOP-76?page=comments#action_12447830 ]
Devaraj Das commented on HADOOP-76:
-----------------------------------
One more comment: we need to document somewhere exactly how (via which
config params) a user of the PhasedFileSystem (e.g. a map method) can
access the JobId, TaskId and TIPId when creating an instance of
PhasedFileSystem. Better yet, add a new constructor to
PhasedFileSystem that takes a JobConf object. Inside that constructor,
you can get the JobId, TaskId and TIPId values and proceed. The user
doesn't have to bother about the details in this case.
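
To make the constructor suggestion concrete, here is a rough sketch of
the idea, not the actual PhasedFileSystem code. Everything in it is an
assumption for illustration: a plain Map stands in for JobConf, and the
class name TaskIdentity and the three property names are hypothetical.
The real config params are exactly what still needs to be documented.

```java
import java.util.Map;

/** Hypothetical holder for the three ids a PhasedFileSystem instance needs. */
final class TaskIdentity {
    // Hypothetical property names, chosen only for this sketch.
    static final String JOB_ID_KEY = "example.job.id";
    static final String TIP_ID_KEY = "example.tip.id";
    static final String TASK_ID_KEY = "example.task.id";

    final String jobId;
    final String tipId;
    final String taskId;

    /** Explicit form: the caller has to know all three ids. */
    TaskIdentity(String jobId, String tipId, String taskId) {
        this.jobId = jobId;
        this.tipId = tipId;
        this.taskId = taskId;
    }

    /**
     * Convenience form: pull the ids out of the job configuration so the
     * caller only passes the object it already has. The Map is a stand-in
     * for JobConf in this sketch.
     */
    static TaskIdentity fromConf(Map<String, String> conf) {
        return new TaskIdentity(
                require(conf, JOB_ID_KEY),
                require(conf, TIP_ID_KEY),
                require(conf, TASK_ID_KEY));
    }

    // Fail early with a clear message when the framework did not set a key.
    private static String require(Map<String, String> conf, String key) {
        String value = conf.get(key);
        if (value == null) {
            throw new IllegalStateException("missing config param: " + key);
        }
        return value;
    }
}
```

With something like this, a map method would call
TaskIdentity.fromConf(conf) and never handle the individual ids itself.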
Implement speculative re-execution of reduces
---------------------------------------------
Key: HADOOP-76
URL: http://issues.apache.org/jira/browse/HADOOP-76
Project: Hadoop
Issue Type: Improvement
Components: mapred
Affects Versions: 0.1.0
Reporter: Doug Cutting
Assigned To: Sanjay Dahiya
Priority: Minor
Attachments: Hadoop-76.patch, Hadoop-76_1.patch,
spec_reducev.patch
As a first step, reduce task outputs should go to temporary files
which are renamed when the task completes.
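
The write-to-temporary-then-rename step described above could look
roughly like the following. This is a minimal sketch on the local file
system using java.nio, not the Hadoop FileSystem API and not the code
in the attached patches. The class name, the "_tmp_" prefix and the
attempt ids are assumptions made up for the example.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

final class TempOutputCommit {

    /** Writes one attempt's output to its own temporary file next to the final output. */
    static Path writeAttempt(Path finalOutput, String attemptId, String data)
            throws IOException {
        // Hypothetical naming scheme: one temp file per attempt, so two
        // speculative attempts of the same reduce never share a file.
        Path tmp = finalOutput.resolveSibling(
                "_tmp_" + attemptId + "_" + finalOutput.getFileName());
        Files.write(tmp, data.getBytes(StandardCharsets.UTF_8));
        return tmp;
    }

    /**
     * Promotes a finished attempt by renaming its temp file to the final name.
     * Returns false, and discards the temp file, if the final output already exists.
     */
    static boolean commit(Path tmp, Path finalOutput) throws IOException {
        try {
            // Without REPLACE_EXISTING, Files.move fails if the target exists.
            Files.move(tmp, finalOutput);
            return true;
        } catch (FileAlreadyExistsException e) {
            Files.deleteIfExists(tmp);
            return false;
        }
    }
}
```

The final name only appears once an attempt has completed, so readers
never see partial output. This sketch does not try to make two
simultaneous commits safe; in a real job a single coordinator would
have to decide which attempt is allowed to commit.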