[ 
http://issues.apache.org/jira/browse/HADOOP-76?page=comments#action_12445321 ] 
            
Owen O'Malley commented on HADOOP-76:
-------------------------------------

1. This patch doesn't apply cleanly anymore. The TaskTracker.java has a 
conflicting change.

2. I agree that it would be nice if FileSystem was an interface, but that is 
way out of scope for this bug. 

You should derive PhasedFileSystem from FileSystem. It should have a "base" 
FileSystem that does the real work and a Map<String,String> that maps "final" 
filenames to "temporary" filenames. 

PhasedFileSystem will need to implement the abstract "Raw" methods from 
FileSystem. The read operations can just be sent directly to the base 
FileSystem. createRaw should create a temporary name, put it in the map and 
create the file using the base FileSystem. The other "modification" methods 
(renameRaw, deleteRaw) should throw UnsupportedOperationException.
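
A minimal sketch of that delegation, for illustration only. RawStore, 
MemoryStore, PhasedStore, tempPrefix and pendingFiles() are hypothetical names 
made up here, not Hadoop's actual FileSystem API. Only the method names 
createRaw, renameRaw and deleteRaw come from the description above, and openRaw 
is assumed to be one of the read operations.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the abstract "Raw" methods; not the real FileSystem class. */
interface RawStore {
    InputStream openRaw(String name) throws IOException;
    OutputStream createRaw(String name) throws IOException;
    boolean renameRaw(String src, String dst) throws IOException;
    boolean deleteRaw(String name) throws IOException;
}

/** In-memory "base" store so the sketch can run without a cluster. */
class MemoryStore implements RawStore {
    final Map<String, ByteArrayOutputStream> files = new HashMap<>();

    public InputStream openRaw(String name) throws IOException {
        ByteArrayOutputStream data = files.get(name);
        if (data == null) throw new FileNotFoundException(name);
        return new ByteArrayInputStream(data.toByteArray());
    }

    public OutputStream createRaw(String name) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        files.put(name, out);
        return out;
    }

    public boolean renameRaw(String src, String dst) {
        ByteArrayOutputStream data = files.remove(src);
        if (data == null) return false;
        files.put(dst, data);
        return true;
    }

    public boolean deleteRaw(String name) {
        return files.remove(name) != null;
    }
}

/** Wrapper that redirects writes to temporary names and remembers the mapping. */
class PhasedStore implements RawStore {
    private final RawStore base;
    private final String tempPrefix; // assumed: directory that holds this task's temp files
    private final Map<String, String> finalToTemp = new HashMap<>();
    private int nextId = 0;

    PhasedStore(RawStore base, String tempPrefix) {
        this.base = base;
        this.tempPrefix = tempPrefix;
    }

    // Reads go straight to the base store.
    public InputStream openRaw(String name) throws IOException {
        return base.openRaw(name);
    }

    // Writes go to a fresh temporary name; the final name is only recorded.
    public synchronized OutputStream createRaw(String name) throws IOException {
        String temp = tempPrefix + "/" + (nextId++);
        finalToTemp.put(name, temp);
        return base.createRaw(temp);
    }

    public boolean renameRaw(String src, String dst) {
        throw new UnsupportedOperationException("renameRaw");
    }

    public boolean deleteRaw(String name) {
        throw new UnsupportedOperationException("deleteRaw");
    }

    /** Copy of the final-name to temporary-name map. */
    synchronized Map<String, String> pendingFiles() {
        return new HashMap<>(finalToTemp);
    }
}
```

With this shape, a file created as "/out/part-00000" exists in the base store 
only under its temporary name until something moves it into place.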

To ensure that the files are cleaned up correctly, I'd suggest that the 
temporary files be stored under:

<system dir>/<job id>/<tip id>/<task id>/<unique file id>

(in DFS, clearly) so that when the job is finished, we just need to delete the 
job directory to clean up any remains of failed tasks that didn't clean up 
properly.
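
As a rough illustration of that layout, the path could be built like this. 
TempPaths and its method names are hypothetical, and using a random UUID for 
the unique file id is an assumption; any id that is unique within the task 
directory would do.

```java
import java.util.UUID;

/** Hypothetical helper that builds paths in the layout suggested above. */
final class TempPaths {
    private TempPaths() {}

    /** <system dir>/<job id>: deleting this removes every leftover of the job. */
    static String jobDir(String systemDir, String jobId) {
        return systemDir + "/" + jobId;
    }

    /** <system dir>/<job id>/<tip id>/<task id> */
    static String taskDir(String systemDir, String jobId, String tipId, String taskId) {
        return jobDir(systemDir, jobId) + "/" + tipId + "/" + taskId;
    }

    /** <system dir>/<job id>/<tip id>/<task id>/<unique file id> */
    static String newTempFile(String systemDir, String jobId, String tipId, String taskId) {
        // Assumption: a random UUID serves as the unique file id.
        return taskDir(systemDir, jobId, tipId, taskId) + "/" + UUID.randomUUID();
    }
}
```

Because every temporary file of every task attempt sits under the job 
directory, one recursive delete of jobDir(...) is enough for cleanup.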

There should also be:
  public void commit() throws IOException {...}
  public void abort() throws IOException {...}
  public void close() throws IOException { abort(); }

For commit, it should lock the tip directory, move the files into place, put a 
DONE touch file in the tip directory and delete the task id directory.
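
A small sketch of how commit, abort and close could fit together, assuming 
that the move/DONE/delete sequence belongs to commit() while close() only 
aborts. It uses java.nio.file on the local disk as a stand-in for DFS. 
PhasedOutputSketch, create() and deleteTree() are hypothetical names, and 
locking the tip directory is left out.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical local-disk stand-in for the commit/abort protocol. */
class PhasedOutputSketch implements Closeable {
    private final Path tipDir;   // <system dir>/<job id>/<tip id>
    private final Path taskDir;  // <tip dir>/<task id>
    private final Map<Path, Path> finalToTemp = new LinkedHashMap<>();
    private boolean finished = false;
    private int nextId = 0;

    PhasedOutputSketch(Path tipDir, String taskId) throws IOException {
        this.tipDir = tipDir;
        this.taskDir = Files.createDirectories(tipDir.resolve(taskId));
    }

    /** Writes go to a temporary file; the final name is only recorded. */
    OutputStream create(Path finalName) throws IOException {
        Path temp = taskDir.resolve("file-" + (nextId++));
        finalToTemp.put(finalName, temp);
        return Files.newOutputStream(temp);
    }

    /** Move files into place, touch DONE, delete the task directory. */
    void commit() throws IOException {
        if (finished) return;
        // NOTE: a real implementation would lock the tip directory first.
        for (Map.Entry<Path, Path> e : finalToTemp.entrySet()) {
            Path parent = e.getKey().getParent();
            if (parent != null) Files.createDirectories(parent);
            Files.move(e.getValue(), e.getKey());
        }
        // createFile throws if DONE already exists.
        Files.createFile(tipDir.resolve("DONE"));
        deleteTree(taskDir);
        finished = true;
    }

    /** Discard everything this task attempt wrote. */
    void abort() throws IOException {
        if (finished) return;
        deleteTree(taskDir);
        finished = true;
    }

    /** Closing without a prior commit() throws the output away. */
    @Override
    public void close() throws IOException {
        abort();
    }

    private static void deleteTree(Path root) throws IOException {
        if (!Files.exists(root)) return;
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(root)) {
            // Reverse order puts children before their parent directory.
            paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : paths) Files.delete(p);
    }
}
```

Making close() equal to abort() means a task that dies or forgets to commit 
never publishes partial output; only an explicit commit() moves files.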

> Implement speculative re-execution of reduces
> ---------------------------------------------
>
>                 Key: HADOOP-76
>                 URL: http://issues.apache.org/jira/browse/HADOOP-76
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.1.0
>            Reporter: Doug Cutting
>         Assigned To: Sanjay Dahiya
>            Priority: Minor
>         Attachments: Hadoop-76.patch, spec_reducev.patch
>
>
> As a first step, reduce task outputs should go to temporary files which are 
> renamed when the task completes.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
