Relationship with streaming?

When defining the implementation for speculative re-execution, please take into account how Streaming is implemented. Specifically, streaming tasks are supposed to run in their own working directories and to read and write temporary Unix files there. Some of these details may matter for the speculative re-execution implementation.

On Nov 7, 2006, at 7:24 AM, Devaraj Das (JIRA) wrote:

[ http://issues.apache.org/jira/browse/HADOOP-76?page=comments#action_12447830 ]

Devaraj Das commented on HADOOP-76:
-----------------------------------

One more comment: we need to document somewhere exactly how (through which config params) a user of the PhasedFileSystem (a map method) can access the JobId, TaskId and TIPId when he wants to create an instance of PhasedFileSystem. Better yet, add a new constructor in PhasedFileSystem that takes a JobConf object. Inside that constructor, you can get the JobId, TaskId and TIPId values and proceed. The user then doesn't have to bother about these details.

Implement speculative re-execution of reduces
---------------------------------------------

                Key: HADOOP-76
                URL: http://issues.apache.org/jira/browse/HADOOP-76
            Project: Hadoop
         Issue Type: Improvement
         Components: mapred
   Affects Versions: 0.1.0
           Reporter: Doug Cutting
        Assigned To: Sanjay Dahiya
           Priority: Minor
        Attachments: Hadoop-76.patch, Hadoop-76_1.patch, spec_reducev.patch


As a first step, reduce task outputs should go to temporary files which are renamed when the task completes.
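
To illustrate the temporary-file-then-rename idea, here is a minimal sketch using plain java.nio on a local filesystem. It is not the code from the attached patches and does not use the Hadoop FileSystem or PhasedFileSystem APIs. The class name TempThenRename, the commit method, the "_tmp_" prefix and the attempt ids are all hypothetical. Each attempt writes to its own temporary file, and the first attempt to finish claims the final name; a later speculative attempt discards its copy.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

public class TempThenRename {

    /**
     * Writes data to a per-attempt temporary file next to finalPath, then
     * tries to rename it to finalPath. Returns true if this attempt's output
     * became the final output, false if another attempt got there first.
     * (Hypothetical helper; names and layout are assumptions.)
     */
    static boolean commit(Path finalPath, String attemptId, byte[] data) throws IOException {
        // Each attempt gets a distinct temporary name, so concurrent attempts never collide.
        Path tmp = finalPath.resolveSibling("_tmp_" + attemptId + "_" + finalPath.getFileName());
        Files.write(tmp, data);
        try {
            // Without REPLACE_EXISTING, move fails if the target already exists.
            Files.move(tmp, finalPath);
            return true;
        } catch (FileAlreadyExistsException e) {
            // Another attempt already committed; drop our temporary output.
            Files.deleteIfExists(tmp);
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("reduce-out");
        Path out = dir.resolve("part-00000");

        boolean first = commit(out, "attempt_0", "from attempt 0".getBytes(StandardCharsets.UTF_8));
        boolean second = commit(out, "attempt_1", "from attempt 1".getBytes(StandardCharsets.UTF_8));

        System.out.println("attempt_0 committed: " + first);
        System.out.println("attempt_1 committed: " + second);
        System.out.println("final contents: "
                + new String(Files.readAllBytes(out), StandardCharsets.UTF_8));
    }
}
```

Note that the exists-check inside the move is not guaranteed to be atomic on every filesystem, so a real implementation would still need the framework to decide which attempt is allowed to commit.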

--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira


